Sequences of hepatitis C virus genotypes and their use as prophylactic, therapeutic and diagnostic agents

ABSTRACT

The present invention relates to new genomic nucleotide sequences and amino acid sequences corresponding to the coding region of these genomes. The invention relates to new HCV type and subtype sequences which are different from the known HCV types and subtypes. More particularly, the present invention relates to new HCV type 7 sequences, new HCV type 9 sequences, new HCV type 10 sequences and new HCV type 11 sequences. Also, the present invention relates to new HCV type 1 sequences of subtypes 1d, 1e, 1f and 1g; new HCV type 2 sequences of subtypes 2e, 2f, 2g, 2h, 2i, 2k and 2l; new HCV type 3 sequences of subtype 3g; new HCV type 4 sequences of subtypes 4k, 4l and 4m; a process for preparing them, and their use for diagnosis, prophylaxis and therapy. More particularly, the present invention provides new type-specific sequences of the Core, the E1 and the NS5 regions of new HCV types 7, 9, 10 and 11, as well as of new variants (subtypes) of HCV types 1, 2, 3 and 4. These new HCV sequences are useful to diagnose the presence of HCV type 1, and/or type 2, and/or type 3, and/or type 4, and/or type 7, and/or type 9, and/or type 10, and/or type 11 genotypes or serotypes in a biological sample. Moreover, the availability of these new type-specific sequences can increase the overall sensitivity of HCV detection and should also prove to be useful for prophylactic and therapeutic purposes.

[0001] The invention relates to new sequences of hepatitis C virus (HCV) genotypes and their use as prophylactic, therapeutic and diagnostic agents.

[0002] The present invention relates to new genomic nucleotide sequences and amino acid sequences corresponding to the coding region of these genomes. The invention relates to new HCV type and subtype sequences which are different from the known HCV type and subtype sequences. More particularly, the present invention relates to new HCV type 7 sequences, new HCV type 9 sequences, new HCV type 10 sequences and new HCV type 11 sequences. Also, the present invention relates to new HCV type 1 sequences of subtypes 1d, 1e, 1f and 1g; new HCV type 2 sequences of subtypes 2e, 2f, 2g, 2h, 2i, 2k and 2l; new HCV type 3 sequences of subtype 3g; new HCV type 4 sequences of subtypes 4k, 4l and 4m; a process for preparing them, and their use for diagnosis, prophylaxis and therapy.

[0003] The technical problem underlying the present invention is to provide new HCV sequences from hitherto unknown HCV types and/or subtypes. More particularly, the present invention provides new type-specific sequences of the Core, the E1 and the NS5 regions of new HCV types 7, 9, 10 and 11, as well as of new variants (subtypes) of HCV types 1, 2, 3 and 4. These new HCV sequences are useful to diagnose the presence of HCV type 1, and/or type 2, and/or type 3, and/or type 4, and/or type 7, and/or type 9, and/or type 10, and/or type 11 genotypes or serotypes in a biological sample. Moreover, the availability of these new type-specific sequences can increase the overall sensitivity of HCV detection and should also prove to be useful for prophylactic and therapeutic purposes.

[0004] Hepatitis C viruses (HCV) have been found to be the major cause of non-A, non-B hepatitis. The sequences of cDNA clones covering the complete genome of several prototype isolates have been determined (Kato et al., 1990; Choo et al., 1991; Okamoto et al., 1991; Okamoto et al., 1992). Comparison of these isolates shows that the variability in nucleotide sequences can be used to distinguish at least 2 different genotypes, type 1 (HCV-1 and HCV-J) and type 2 (HC-J6 and HC-J8), with an average homology of about 68%. Within each type, at least two subtypes exist (e.g. represented by HCV-1 and HCV-J), having an average homology of about 79%. HCV genomes belonging to the same subtype show average homologies of more than 90% (Okamoto et al., 1992). However, the partial nucleotide sequence of the NS5 region of the HCV-T isolates showed at most 67% homology with the previously published sequences, indicating the existence of yet another HCV type (Mori et al., 1992). Parts of the 5′ untranslated region (UR), core, NS3, and NS5 regions of this type 3 have been published, further establishing the similar evolutionary distances between the 3 major genotypes and their subtypes (Chan et al., 1992). Type 4 was subsequently discovered (Stuyver et al., 1993b; Simmonds et al., 1993a; Bukh et al., 1993; Stuyver et al., 1994a), as were type 5 (Stuyver et al., 1993b; Simmonds et al., 1993c; Bukh et al., 1993; Stuyver et al., 1994b) and type 6 HCV groups (Bukh et al., 1993; Simmonds et al., 1993c). An overview of the present state of the art regarding HCV genotypes is given in Table 3. The nomenclature system proposed by the inventors of the present application has now been accepted by scientists worldwide (Simmonds et al., 1994).

[0005] The aim of the present invention is to provide new HCV nucleotide and amino acid sequences enabling the detection of HCV infection.

[0006] Another aim of the present invention is to provide new nucleotide and amino acid HCV sequences enabling the classification of infected biological fluids into different serological groups.

[0007] Another aim of the present invention is to provide new nucleotide and amino acid HCV sequences ameliorating the overall HCV detection rate.

[0008] Another aim of the present invention is to provide new HCV sequences, useful for the design of HCV prophylactic or therapeutic vaccine compositions.

[0009] Another aim of the present invention is to provide a pharmaceutical composition consisting of antibodies raised against the polypeptides encoded by these new HCV sequences, for therapy or diagnosis.

[0010] All the aims of the present invention are met by the following embodiments of the present invention.

[0011] The present invention relates more particularly to an HCV polynucleic acid, having a nucleotide sequence which is unique to a heretofore unidentified HCV type or subtype which is different from HCV subtypes 1a, 1b, 1c, 2a, 2b, 2c, 2d, 3a, 3b, 3c, 3d, 3f, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 5a or 6a, with said HCV subtypes being classified as in Table 3 by comparison of a part of the NS5 gene nucleotide sequence spanning positions 7932 to 8271, with said amino acid numbering being shown in Table 1, and with said polynucleic acid containing at least one nucleotide differing from said known HCV nucleotide sequences, or the complement thereof. The sequence of known HCV isolates may be found in any nucleotide sequence database known in the art (such as for instance the EMBL database).

[0012] The present invention thus also relates to a polynucleic acid having a nucleotide sequence which is unique to at least one of HCV subtypes 1d, 1e, 1f, 1g, 2e, 2f, 2g, 2h, 2i, 2k, 2l, 3g, 4k, 4l, 4m, 7a, 7c or 7d, with said HCV subtypes being classified as defined above.

[0013] The present invention thus also relates to a polynucleic acid having a nucleotide sequence which is unique to at least one of HCV types 9, 10 or 11, with said HCV types being classified as defined above.

[0014] It is to be noted that the nucleotide(s) difference in the polynucleic acids of the invention may involve an amino acid difference in the corresponding amino acid sequences encoded by said polynucleic acids. A composition according to the present invention may contain only polynucleic acid sequences or polynucleic acid sequences mixed with any excipient known in the art of diagnosis, prophylaxis or therapy.

[0015] According to a preferred embodiment, the present invention relates to a polynucleic acid encoding an HCV polyprotein comprising in its amino acid sequence at least one of the following amino acid residues: I15, C38, V44, A49, Q43, P49, Q55, A58, S60 or D60, E68 or V68, H70, A71 or Q71 or N71, D72, H81, H101, D106, S110, L130, I134, E135, L140, S148, T150 or E150, Q153, F155, D157, G160, E165, I169, F181, L186, T190, T192 or I192 or H192, I193, A195, S196, R197 or N197 or K197, Q199 or D199 or H199 or N199, F200 or T200, A208, I213, M216 or S216, N217 or S217 or G217 or K217, T218, I219, A222, Y223, I230, W231 or L231, S232 or H232 or A232, Q233, E235 or L235, F236 or T236, F237, L240 or M240, A242, N244, N249, I250 or K250 or R250, A252 or C252, A254, I255 or V255, D256 or M256, E257, E260 or K260, R261, V268, S272 or R272, I285, G290 or F290, A291, A293 or L293 or W293, T294 or A294, S295 or H295, K296 or E296, Y297 or M297, I299 or Y299, I300, S301, P316, S2646, A2648, G2649, A2650, V2652, Q2653, H2656 or L2656, D2657, F2659, K2663 or Q2663, A2667 or V1667, D2677, L2681, M2686 or Q2686 or E2686, A2692 or K2692, H2697, I2707, L2708 or Y2708, A2709, A2719 or M2719, F2727, T2728 or D2728, E2729, F2730 or Y2730, I2741, I2745, V2746 or E2746 or L2746 or K2746, A2748, S2749 or P2749, R2750, E2751, D2752 or N2752 or S2752 or T2752 or V2752 or I2752 or Q2752, S2753 or D2753 or G2753, D2754, A2755, L2756 or Q275, R2757,

[0016] with said notation being composed of a letter representing the amino acid residue by its one-letter code, and a number representing the amino acid numbering according to Kato et al. (1990), as shown in Table 1,

[0017] or a part of said polynucleic acid which is unique to at least one of the HCV subtypes or types as defined in Table 5, and which contains at least one nucleotide differing from known HCV nucleotide sequences, or the complement thereof.

[0018] Each of the above-mentioned residues can be found in FIGS. 2, 4 or 6 showing the new amino acid sequences of the present invention aligned with known sequences of other types or subtypes of HCV for the Core/E1 region.

[0019] According to another preferred embodiment, the present invention relates to a polynucleic acid encoding a HCV polyprotein comprising in its amino acid sequence at least one amino acid sequence chosen from the following list:

ARQSDGRSWAQ or ARRSEGRSWAQ as for subtype 1d (SEQ ID NO 107 and 108)
ERRPEGRSWAQ as for subtype 1e (SEQ ID NO 109)
ARRPEGRSWAQ as for subtype 1f (SEQ ID NO 110)
DRRTTGKSWGR as for subtype 2k (SEQ ID NO 111)
DRRATGRSWGR as for subtype 2e (SEQ ID NO 112)
DRRATGKSWGR as for subtype 2f (SEQ ID NO 113)
VRQPTGRSWGQ as for type 9 (SEQ ID NO 114)
VRHQTGRTWAQ as for subtype 7a and 7c (SEQ ID NO 115)
VRQNQGRTWAQ as for subtype 7d (SEQ ID NO 116)
ARRTEGRSWAQ as for type 10 (SEQ ID NO 117)
VRRTTGRXXXX or VRRTTGRTWAQ as for type 11 (SEQ ID NO 118 and 119)
HEVRNASGVYHV or HEVRNASGVYHL as for subtype 1d (SEQ ID NO 120 and 121)
YEVHSTTDGYHV as for subtype 1f (SEQ ID NO 122)
VEVKNTSQAYMA as for subtype 2e (SEQ ID NO 123)
IQVKNNSHFYMA as for subtype 2f (SEQ ID NO 124)
VQVKNTSTMYMA as for subtype 2g (SEQ ID NO 125)
VQVKNTSHSYMV as for subtype 2h (SEQ ID NO 126)
VQVANRSGSYMV as for subtype 2i (SEQ ID NO 127)
VEIKNTXNTYVL or VEIKNTSNTYVL as for subtype 2k (SEQ ID NO 128 and 129)
INYRNVSGIYYV or INYRNTSGIYHV or INYHNTSGIYHI or TNYRNVSGIYHV as for subtype 4k (SEQ ID NO 130, 131, 132 or 133)
QHYRNVSGIYHV as for subtype 4l (SEQ ID NO 134)
IQVKNASGIYHL as for type 9 (SEQ ID NO 135)
AHYTNKSGLYHL as for subtype 7c (SEQ ID NO 136)
LNYANKSGLYHL as for subtype 7d (SEQ ID NO 137)
LEYRNASGLYMV as for type 10 (SEQ ID NO 138)
IYEMDGMIMHY or IYEMSGMILHA as for subtype 1d (SEQ ID NO 139 and 140)
VYEAKDIILHT as for subtype 1f (SEQ ID NO 141)
VWQLXDAVLHV as for subtype 2e (SEQ ID NO 142)
VWQLRDAVLHV as for subtype 2f (SEQ ID NO 143)
IWQMQGAVLHV as for subtype 2g (SEQ ID NO 144)
VWQLKDAVLHV as for subtype 2h (SEQ ID NO 145)
VWQLEEAVLHV as for subtype 2i (SEQ ID NO 146)
TWQLXXAVLHV as for subtype 2k (SEQ ID NO 147)
VYEADHHILHL or VYEADHHILAL or VFEADHHILHL as for subtype 4k (SEQ ID NO 148, 149 and 150)
VYESDHHILHL as for subtype 4l (SEQ ID NO 151)
VFEAETMILHL as for type 9 (SEQ ID NO 152)
VYEAETLILHL as for subtype 7c (SEQ ID NO 153)
VYEANGMILHL as for subtype 7d (SEQ ID NO 154)
VYEAGDIILHL as for type 10 (SEQ ID NO 155)
VREDNHLRCWMAL or VRENNSSRCWMAL as for subtype 1d (SEQ ID NO 156 and 157)
IREGNISRCWVPL as for subtype 1f (SEQ ID NO 158)
ENSSGRFHCWIPI as for subtype 2e (SEQ ID NO 159)
ERSGNRTFCWTAV as for subtype 2f (SEQ ID NO 160)
ELQGNKSRCWIPV as for subtype 2g (SEQ ID NO 162)
ERHQNQSRCWIPV as for subtype 2h (SEQ ID NO 163)
EWKDNTSRCWIPV as for subtype 2i (SEQ ID NO 164)
EREGNSSRCWIPV as for subtype 2k (SEQ ID NO 165)
VREGNQSRCWVAL or VRTGNQSRCWVAL or VRVGNQSSCWVAL or VRVGNQSRCWVAL or VKEGNHSRCWVAL as for subtype 4k (SEQ ID NO 166, 167, 168 or 169)
VKTGNTSRCWVAL as for subtype 4l (SEQ ID NO 170)
IKAGNESRCWLPV as for type 9 (SEQ ID NO 171)
VKEGNQSRCWVQA as for subtype 7c (SEQ ID NO 172)
VKXXNLTKCWLSA as for subtype 7d (SEQ ID NO 173)
VRSGNTSRCWIPV as for type 10 (SEQ ID NO 174)
VKNASVPTAA or VKDANVPTAA as for subtype 1d (SEQ ID NO 175 and 176)
ARIANAPIDE as for subtype 1f (SEQ ID NO 177)
VSKPGALTKG as for subtype 2e (SEQ ID NO 178)
VSRPGALTRG as for subtype 2f (SEQ ID NO 179)
VNQPGALTRG as for subtype 2g (SEQ ID NO 180)
VSQPGALTRG as for subtype 2h (SEQ ID NO 181)
VSQPGALTKG as for subtype 2i (SEQ ID NO 182)
VSRPGALTEG as for subtype 2k (SEQ ID NO 183)
APYIGAPLES or APYTAAPLES as for subtype 4k (SEQ ID NO 184 and 185)
APILSAPLMS as for subtype 4l (SEQ ID NO 186)
VPNSSVPIHG as for type 9 (SEQ ID NO 187)
VPNASTPVTG as for subtype 7c (SEQ ID NO 188)
VQNASVSIRG as for subtype 7d (SEQ ID NO 189)
VKSPCAATAS as for type 10 (SEQ ID NO 190)
SPRMHHTTQE or SPRLYHTTQE as for subtype 1d (SEQ ID NO 191 and 192)
TSRRHWTVQD as for subtype 1f (SEQ ID NO 193)
APKRHYFVQE as for subtype 2e (SEQ ID NO 194)
SPQYHTFVQE as for subtype 2f (SEQ ID NO 195)
SPQHHNFSQD as for subtype 2g (SEQ ID NO 196)
SPQHHIFVQD as for subtype 2h (SEQ ID NO 197)
SPEHHHFVQD as for subtype 2k (SEQ ID NO 198)
RPRRHWTTQD or RPRRHWTAQD or QPRRHWTTQD or RPRRHWTTQE as for subtype 4k (SEQ ID NO 199, 200, 201 or 202)
QPRRHWTVQD as for subtype 4l (SEQ ID NO 203)
RPKYHQVTQD as for type 9 (SEQ ID NO 204)
RPRMHQWQE as for subtype 7c (SEQ ID NO 205)
RPRMYEIAQD as for subtype 7d (SEQ ID NO 206)
RHRQHWTVQD as for type 10 (SEQ ID NO 207)

[0020] or a part of said polynucleic acid which is unique to at least one of the HCV subtypes or types as defined in Table 5, and which contains at least one nucleotide differing from known HCV nucleotide sequences, or the complement thereof.

[0021] Using the 5′ non-coding LiPA system (Stuyver et al., 1993) and a new core LiPA system including multiple probes for subtypes 1a, 1b, 1c, 2a, 2b or 2c derived from the core region (Stuyver et al., 1995), samples from the Benelux, Cameroon, France and Vietnam were selected because of their aberrant reactivities (isolates CAM1078, FR2, FR1, VN4, VN12, VN13, NE98). Some samples were sequenced, together with many other samples, as a control for typing. The sequencing results, however, revealed new subtypes (isolates BNL1, BNL2, BNL3, FR4, BNL4, BNL5, BNL6, BNL7, BNL8, BNL9, BNL10, BNL11 and BNL12). Nucleotide sequences in the core and E1 regions which have not been reported before were analyzed in the frame of the invention. Genomic sequences of subtype 1d, 1e, 1f, 1g, 2e, 2f, 2g, 2h, 2i, 2k, 2l, 3g, 4k, 4l, 4m, 7a, 7c, 7d and types 9, 10 and 11 isolates are reported for the first time in the present invention. The NS5B region was also analyzed.

[0022] The term “polynucleic acid” refers to a single-stranded or double-stranded nucleic acid sequence which may contain at least 5 contiguous nucleotides in common with the complete nucleotide sequence (e.g. at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 75 or more contiguous nucleotides). A polynucleic acid which is up to about 100 nucleotides in length is often also referred to as an oligonucleotide. A polynucleic acid may consist of deoxyribonucleotides or ribonucleotides, nucleotide analogues or modified nucleotides, or may have been adapted for therapeutic purposes. A polynucleic acid may also comprise a double stranded cDNA clone which can be used for cloning purposes, or for in vivo therapy, or prophylaxis.

[0023] The oligonucleotides according to the present invention, used as primers or probes, may also contain or consist of nucleotide analogues such as phosphorothioates (Matsukura et al., 1987), alkylphosphoriates (Miller et al., 1979) or peptide nucleic acids (Nielsen et al., 1991; Nielsen et al., 1993) or may contain intercalating agents (Asseline et al., 1984).

[0024] Like most other variations or modifications introduced into the original DNA sequences of the invention, these variations will necessitate adaptations with respect to the conditions under which the oligonucleotide should be used to obtain the required specificity and sensitivity. However, the eventual results will be essentially the same as those obtained with the unmodified oligonucleotides.

[0025] The introduction of these modifications may be advantageous in order to positively influence characteristics such as hybridization kinetics, reversibility of the hybrid-formation, biological stability of the oligonucleotide molecules, etc.

[0026] The polynucleic acids of the invention may be comprised in a composition of any kind. Said composition may be for diagnostic, therapeutic or prophylactic use.

[0027] The expression “sequences which are unique to an HCV type or subtype” refers to sequences which are not shared by any other type or subtype of HCV, and can thus be used to uniquely detect that HCV type or subtype. Sequence variability is demonstrated in the present invention between the newly found HCV types and subtypes (see Table 5) and the known HCV types and subtypes (see Table 3), and it is therefore from these regions of sequence variability in particular that type- or subtypes-specific polynucleic acids, oligonucleotides, polypeptides and peptides may be obtained. The term type- or subtypes-specific refers to the fact that a sequence is unique to that HCV type or subtype involved.

[0028] The expression “nucleotides corresponding to” refers to nucleotides which are homologous or complementary to an indicated nucleotide sequence or region within a specific HCV sequence.

[0029] The term “coding region” corresponds to the region of the HCV genome that encodes the HCV polyprotein. In fact, it comprises the complete genome with the exception of the 5′ untranslated region and 3′ untranslated region.

[0030] The term “HCV polyprotein” refers to the HCV polyprotein of the HCV-J isolate (Kato et al., 1990). The adenine residue at position 330 (Kato et al., 1990) is the first residue of the ATG codon that initiates the long HCV polyprotein of 3010 amino acids in HCV-J and other type 1b isolates, and of 3011 amino acids in HCV-1 and other type 1a isolates, and of 3033 amino acids in type 2 isolates HC-J6 and HC-J8 (Okamoto et al., 1992).

[0031] This adenine is designated as position 1 at the nucleic acid level, and this methionine is designated as position 1 at the amino acid level, in the present invention. As type 1a isolates contain 1 extra amino acid in the NS5A region, coding sequences of type 1a and 1b have identical numbering in the Core, E1, NS3, and NS4 region, but will differ in the NS5B region as indicated in Table 1. Type 2 isolates have 4 extra amino acids in the E2 region, and 17 or 18 extra amino acids in the NS5 region compared to type 1 isolates, and will differ in numbering from type 1 isolates in the NS3/4 region and NS5b regions as indicated in Table 1. Similar insertions compared with type 1 (but of a different size) can also be observed in type 3a sequences which affect the numbering of type 3a amino acids accordingly. Other insertions or deletions may be readily observed in type 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11 sequences after alignment with known HCV sequences.

TABLE 1

Region               Positions described    Positions described    Positions described    Positions described
                     in the present         for HCV-J              for HCV-1              for HC-J6, HC-J8
                     invention*             (Kato et al., 1990)    (Choo et al., 1991)    (Okamoto et al., 1992)

Nucleotides
NS5B                 8023/8235              8352/8564              8026/8238              8433/8645
                     7932/8271              8261/8600              7935/8274              8342/8681
coding region of                            330/9359               1/9033                 342/9439
present invention

Amino Acids
NS5B                 2675/2745              2675/2745              2676/2746              2698/2768
                     2645/2757              2645/2757              2646/2758              2668/2780

[0032] Table 1: Comparison of the HCV nucleotide and amino acid numbering system used in the present invention (*) with the numbering used for other prototype isolates. For example, 8352/8564 indicates the region designated by the numbering from nucleotide 8352 to nucleotide 8564 as described by Kato et al. (1990). Since the numbering system of the present invention starts at the polyprotein initiation site, the 329 nucleotides of the 5′ untranslated region described by Kato et al. (1990) have to be subtracted, and the corresponding region is numbered from nucleotide 8023 (‘8352-329’) to 8235 (‘8564-329’).
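The conversion described for Table 1 can be illustrated with a minimal sketch. The function names below are hypothetical and not part of the specification; the only values taken from the text are the 329-nucleotide offset for the numbering of Kato et al. (1990) and the example positions 8352 and 8564. The codon arithmetic assumes that nucleotide 1 is the adenine of the initiating ATG, as stated above.

```python
# Hypothetical helper names; only the 329-nucleotide offset and the example
# positions come from the text describing Table 1.
KATO_5UR_LENGTH = 329  # 5' untranslated nucleotides in Kato et al. (1990)


def kato_to_invention(nt_pos: int) -> int:
    """Convert a Kato et al. (1990) nucleotide position to the numbering
    that starts at the adenine of the polyprotein ATG (position 1)."""
    converted = nt_pos - KATO_5UR_LENGTH
    if converted < 1:
        raise ValueError("position lies in the 5' untranslated region")
    return converted


def codon_index(nt_pos: int) -> int:
    """Return the 1-based amino acid position whose codon contains the
    given nucleotide position (numbering starting at the ATG)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1


start, end = kato_to_invention(8352), kato_to_invention(8564)
print(start, end)                              # 8023 8235
print(codon_index(start), codon_index(end))    # 2675 2745
```

The printed values agree with the first NS5B row of Table 1 (nucleotides 8023/8235, amino acids 2675/2745).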

[0033] The term “genotype” as used in the present invention refers to both types and/or subtypes.

[0034] The term “HCV type” corresponds to a group of HCV isolates of which the complete genome shows more than 73%, preferably more than 74%, homology at the nucleic acid level, or of which the NS5 region between nucleotide positions 7932 and 8271 shows more than 75.4% homology at the nucleic acid level, or of which the complete HCV polyprotein shows more than 78% homology at the amino acid level, or of which the NS5 region between amino acids at positions 2645 and 2757 shows more than 80% homology at the amino acid level, to polyproteins of the other isolates of the group, with said numbering beginning at the first ATG codon or first methionine of the long HCV polyprotein of the HCV-J isolate (Kato et al., 1990). Isolates belonging to different types of HCV exhibit homologies, over the complete genome, of less than 74%, preferably less than 73%, at the nucleic acid level and less than 78% at the amino acid level. Isolates belonging to the same type usually show homologies of about 90 to 99% at the nucleic acid level and 95 to 96% at the amino acid level when belonging to the same subtype, and those belonging to the same type but different subtypes preferably show homologies of about 76% to 82% (more particularly of about 77% to 80%) at the nucleic acid level and 85-86% at the amino acid level.

[0035] More preferably the definition of HCV types is concluded from the classification of HCV isolates according to their nucleotide distances calculated as detailed below:

[0036] (1) based on phylogenetic analysis of nucleic acid sequences in the NS5B region between nucleotides 7935 and 8274 (Choo et al., 1991) or 8261 and 8600 (Kato et al., 1990) or 8342 and 8681 (Okamoto et al., 1991), isolates belonging to the same HCV type show nucleotide distances of less than 0.34, usually less than 0.33, and more usually of less than 0.32, and isolates belonging to the same subtype show nucleotide distances of less than 0.135, usually of less than 0.13, and more usually of less than 0.125, usually ranging between 0.0003 and 0.1151, and consequently isolates belonging to the same type but different subtypes show nucleotide distances ranging from 0.135 to 0.34, usually ranging from 0.1384 to 0.2977, and more usually ranging from 0.15 to 0.32, and isolates belonging to different HCV types show nucleotide distances greater than 0.34, usually greater than 0.35, and more usually of greater than 0.358, more usually ranging from 0.3581 to 0.6670.

[0037] (2) based on phylogenetic analysis of nucleic acid sequences in the core/E1 region between nucleotides 378 and 957, isolates belonging to the same HCV type show nucleotide distances of less than 0.38, usually of less than 0.37, and more usually of less than 0.364, and isolates belonging to the same subtype show nucleotide distances of less than 0.17, usually of less than 0.16, and more usually of less than 0.15, more usually less than 0.135, more usually less than 0.134, and consequently isolates belonging to the same type but different subtypes show nucleotide distances ranging from 0.15 to 0.38, usually ranging from 0.16 to 0.37, and more usually ranging from 0.17 to 0.36, more usually ranging from 0.133 to 0.379, and isolates belonging to different HCV types show nucleotide distances greater than 0.34, 0.35, 0.36, usually more than 0.365, and more usually of greater than 0.37.

TABLE 2
Molecular evolutionary distances

             core/E1              E1                   NS5B                 NS5B
Region       579 bp               384 bp               340 bp               222 bp

Isolates*    0.0017 − 0.1347      0.0026 − 0.2031      0.0003 − 0.1151      0.000 − 0.1323
             (0.0750 ± 0.0245)    (0.0969 ± 0.0289)    (0.0637 ± 0.0229)    (0.0607 ± 0.0205)
Subtypes*    0.1330 − 0.3794      0.1645 − 0.4869      0.1384 − 0.2977      0.117 − 0.3538
             (0.2786 ± 0.0363)    (0.3761 ± 0.0433)    (0.2219 ± 0.0341)    (0.2391 ± 0.0399)
Types*       0.3479 − 0.6306      0.4309 − 0.9561      0.3581 − 0.6670      0.3457 − 0.7471
             (0.4703 ± 0.0525)    (0.6308 ± 0.0928)    (0.4994 ± 0.0495)    (0.5295 ± 0.0627)

[0038] Table 2: Figures created by the PHYLIP program DNADIST are expressed as minimum to maximum (average ± standard deviation). Phylogenetic distances for isolates belonging to the same subtype (‘isolates’), to different subtypes of the same type (‘subtypes’), and to different types (‘types’) are given.
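As an illustration of criterion (1), the following minimal sketch assigns a pairwise NS5B nucleotide distance to one of the three categories. The function name is hypothetical, and only the broadest cut-offs stated for the 340 bp NS5B region (0.135 and 0.34) are used; treating a distance of exactly 0.34 as "same type, different subtype" is an assumption, since the text leaves that boundary value open.

```python
# Cut-offs taken from criterion (1) for the 340 bp NS5B region.
SUBTYPE_CUTOFF = 0.135   # below this: same subtype
TYPE_CUTOFF = 0.34       # above this: different types


def classify_ns5b_distance(distance: float) -> str:
    """Classify a pairwise nucleotide distance (hypothetical helper)."""
    if distance < 0:
        raise ValueError("distances cannot be negative")
    if distance < SUBTYPE_CUTOFF:
        return "same subtype"
    if distance <= TYPE_CUTOFF:  # boundary handling is an assumption
        return "same type, different subtype"
    return "different types"


# The three NS5B 340 bp averages reported in Table 2:
for d in (0.0637, 0.2219, 0.4994):
    print(d, classify_ns5b_distance(d))
```

Each of the three Table 2 averages falls into the category of the row it was taken from.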

[0039] In a comparative phylogenetic analysis of available sequences, ranges of molecular evolutionary distances for different regions of the genome were calculated, based on 19,781 pairwise comparisons by means of the DNADIST program of the phylogeny inference package PHYLIP version 3.5c (Felsenstein, 1993). The results are shown in Table 2 and indicate that although the majority of distances obtained in each region fit with classification of a certain isolate, only the ranges obtained in the 340 bp NS5B region are non-overlapping and therefore conclusive. However, as was performed in the present invention, it is preferable to obtain sequence information from at least 2 regions before final classification of a given isolate.
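The statement that only the 340 bp NS5B ranges are non-overlapping can be checked directly against the minimum and maximum values of Table 2. The sketch below is illustrative only; the dictionary layout and function name are assumptions, while the numbers are the Table 2 ranges for isolates, subtypes and types, in that order.

```python
# (min, max) distance ranges from Table 2: isolates, subtypes, types.
TABLE_2_RANGES = {
    "core/E1 579 bp": [(0.0017, 0.1347), (0.1330, 0.3794), (0.3479, 0.6306)],
    "E1 384 bp": [(0.0026, 0.2031), (0.1645, 0.4869), (0.4309, 0.9561)],
    "NS5B 340 bp": [(0.0003, 0.1151), (0.1384, 0.2977), (0.3581, 0.6670)],
    "NS5B 222 bp": [(0.000, 0.1323), (0.117, 0.3538), (0.3457, 0.7471)],
}


def ranges_are_disjoint(ranges):
    """True if no two (min, max) intervals overlap."""
    ordered = sorted(ranges)
    # After sorting by lower bound, checking neighbours is sufficient.
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


non_overlapping = [
    region for region, ranges in TABLE_2_RANGES.items()
    if ranges_are_disjoint(ranges)
]
print(non_overlapping)  # ['NS5B 340 bp']
```

In the other three regions the upper end of one category exceeds the lower end of the next (for example 0.1347 versus 0.1330 in core/E1), which is why a single distance from those regions may be inconclusive.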

[0040] Designation of a number to the different types of HCV and HCV nomenclature is based on chronological discovery of the different types. The numbering system used in the present invention might still fluctuate according to international conventions or guidelines. For example, “type 4” might be changed into “type 5” or “type 6”. Also the arbitrarily chosen border distances between types and subtypes and isolates may still be subject to change according to international guidelines or conventions. Therefore types 7a, 8a, 8b, 9a may for example be designated 6b, 6c, 6d, and 6d in the future; and type 10a which shows relatedness with genotype 3 may be denoted 3g instead of 10a.

[0041] The term “subtype” corresponds to a group of HCV isolates of which the complete polyprotein shows a homology of more than 90% both at the nucleic acid and amino acid levels, or of which the NS5 region between nucleotide positions 7932 and 8271 shows a homology of more than 90% at the nucleic acid level to the corresponding parts of the genomes of the other isolates of the same group, with said numbering beginning with the adenine residue of the initiation codon of the HCV polyprotein. Isolates belonging to the same type but different subtypes of HCV show homologies of more than 74% at the nucleic acid level and of more than 78% at the amino acid level.

[0042] It is to be understood that extremely variable regions such as the E1, E2 and NS4 regions will exhibit lower homologies than the average homology of the complete genome of the polyprotein.

[0043] Using these criteria, HCV isolates can be classified into at least 11 types. Several subtypes can clearly be distinguished in types 1, 2, 3, 4 and 7: 1a, 1b, 1c, 1d, 1e, 1f, 1g, 2a, 2b, 2c, 2d, 2e, 2f, 2g, 2h, 2i, 2k, 2l, 3a, 3b, 3c, 3d, 3f, 3g, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 4k, 4l, 4m, 7a, 7c, and 7d based on homologies of the 5′ UR and coding regions. An overview of most of the reported isolates and their proposed classification according to the typing system of the present invention as well as other proposed classifications is presented in Table 3.

TABLE 3
HCV CLASSIFICATION

        OKAMOTO   MORI   CHA    NAKAO   PROTOTYPE
1a      I         I      Pt     GI      HCV-1, HCV-H, HC-J1
1b      II        II     KI     GII     HCV-J, HCV-BK, HCV-T, HC-JK1, HC-J4, HCV-CHINA
1c                                      HC-G9
2a      III       III    K2a    GIII    HC-J6
2b      IV        IV     K2b    GIII    HC-J8
2c                                      S83, ARG6, ARG8, I10, T983
2d                                      NE92
3a      V         V      K3     GIV     BR36, BR56, HD10, N2L1, BR33, Ta, E-b1
3b                VI     K3     GIV     HCV-TR, Tb, NE137
3c                                      NE48
3d                                      NE274
3e                                      NE145
3f                                      NE125
4a                                      Z4, GB809-4
4b                                      Z1
4c                                      GB116, GB358, GB215, Z6, Z7
4d                                      DK13
4e                                      GB809-2, CAM600, CAM736
4f                                      CAM622, CAM627
4g                                      GB549
4h                                      GB438
4i                                      CAR4/1205
4j                                      CAR1/905
5a                              GV      SA3, SA4, SA1, SA7, SA11, BE95
6a                                      HK1, HK2, HK3, HK4, VN11

[0044] Table 3: Overview of the known HCV types and subtypes classified according to the different authors.

[0045] The term “complement” refers to a nucleotide sequence which is complementary to an indicated sequence and which is able to hybridize to the indicated sequences.
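A minimal sketch of how a complement in this sense might be computed is given below. It assumes DNA written 5′ to 3′ with unambiguous bases only and standard antiparallel Watson-Crick pairing, so the hybridizing strand is the reverse complement; the function name and the example sequence are illustrative and are not taken from the sequence listing.

```python
# Translation table for Watson-Crick pairing (DNA, unambiguous bases only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the strand able to hybridize to seq, written 5' to 3'."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported symbols: {sorted(invalid)}")
    return seq.translate(_COMPLEMENT)[::-1]


example = "ATGAGC"  # made-up sequence for illustration
print(reverse_complement(example))                       # GCTCAT
print(reverse_complement(reverse_complement(example)))   # ATGAGC
```

Applying the function twice returns the original sequence, as expected for complementary strands.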

[0046] The composition of the invention can comprise many combinations. By way of example, the composition of the invention can comprise:

[0047] two (or more) nucleic acids from the same region or,

[0048] two nucleic acids (or more), respectively from different regions, for the same isolate or for different isolates,

[0049] or nucleic acids from the same regions and from at least two different regions (for the same isolate or for different isolates).

[0050] The present invention relates particularly to a polynucleic acid as defined above having a sequence selected from any of SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103 to 105, or a part of said polynucleic acid which is unique to any of the HCV subtypes or types as defined in Table 5, and which contains at least one nucleotide differing from known HCV polynucleic acids, or the complement thereof.

[0051] The present invention relates more particularly to a polynucleic acid as defined above, which codes for the 5′ UR, the Core/E1, the NS4 or the NS5B region or a part thereof.

[0052] More particularly, the present invention relates to a polynucleic acid as defined above which is a cDNA sequence.

[0053] Also included within the present invention are sequence variants of the polynucleic acids as selected from any of the nucleotide sequences as given in any of the above given SEQ ID numbers, with said sequence variants containing either deletions and/or insertions of one or more nucleotides, especially insertions or deletions of 1 or more codons, mainly at the extremities of oligonucleotides (either 3′ or 5′), or substitutions of some non-essential nucleotides (i.e. nucleotides not essential to discriminate between different genotypes of HCV) by others (including modified nucleotides and/or inosine). For example, a type 1 or 2 sequence might be modified into a type 7 sequence by replacing some nucleotides of the type 1 or 2 sequence with type-specific nucleotides of type 7 as shown in for instance FIGS. 1 and 2.

[0054] Particularly preferred variant polynucleic acids of the present invention also include sequences which hybridise under stringent conditions with any of the polynucleic acid sequences of the present invention, particularly sequences which show a high degree of homology (similarity) to any of the polynucleic acids of the invention as described above, and more particularly sequences which are at least 80%, 85%, 90%, 95% or more homologous to said polynucleic acid sequences of the invention. Preferably said sequences will have less than 20%, 15%, 10%, or 5% variation of the original nucleotides of said polynucleic acid sequence.
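The percentage thresholds above can be illustrated with a simple calculation. The sketch below assumes that "homology" is measured as the percentage of identical positions between two sequences that are already aligned and of equal length; the specification does not fix a particular algorithm, so this is one possible reading. The function name and the two short sequences are made up for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    equal-length sequences (gaps are not handled in this sketch)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


reference = "ATGAGCACGA"  # made-up 10-mer
variant = "ATGAGCACAA"    # differs at one position
identity = percent_identity(reference, variant)
print(identity)           # 90.0
print(identity >= 80.0)   # True: within the broadest threshold named above
```

Under this reading, a variant with 90% identity has 10% variation relative to the original nucleotides.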

[0055] Polynucleic acid sequences according to the present invention which are homologous to the sequences as represented by a SEQ ID NO can be characterized and isolated according to any of the techniques known in the art, such as amplification by means of sequence-specific primers, hybridization with sequence-specific probes under more or less stringent conditions, serological screening methods or via the LiPA typing system.

[0056] Other preferred variant polynucleic acids of the present invention include sequences which are redundant as a result of the degeneracy of the genetic code compared to any of the above-given polynucleic acids of the present invention. These variant polynucleic acid sequences will thus encode the same amino acid sequence as the polynucleic acids they are derived from.
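As an illustration of degeneracy, the following sketch translates two different nucleotide sequences with the standard genetic code and checks whether they encode the same peptide. The helper names `translate` and `encodes_same_peptide` and the example codons are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_peptide(dna_a: str, dna_b: str) -> bool:
    """True if both sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


# GCT/GCA both encode Ala; CGT, CGC and AGA all encode Arg.
print(encodes_same_peptide("GCTCGTCGT", "GCACGCAGA"))
```

Two such sequences differ at the nucleotide level yet fall within the same group of degenerate variants, because the encoded amino acid sequence is unchanged.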

[0057] Also included within the scope of the present invention are 5′ non-coding region sequences which can be readily obtained from type 1 subtype 1d, 1e, 1f or 1g isolates; type 2 subtype 2e, 2f, 2g, 2h, 2i, 2k or 2l isolates; type 3 subtype 3g isolates; type 4 subtype 4k, 4l or 4m isolates; type 7 subtype 7a, 7c or 7d isolates, type 9, type 10 or type 11 isolates described herein. Such sequences may contain type or subtype-specific motifs which can be employed for type and/or subtype-specific hybridization assays, e.g. as described by Stuyver et al. (1993).

[0058] Polynucleic acid sequences of the genomes indicated above from regions not yet depicted in the present examples, figures and sequence listing can be obtained by any of the techniques known in the art, such as amplification techniques using suitable primers from the sequences of these new genomes given in FIG. 1 of the present invention.

[0059] The present invention also relates to an oligonucleotide primer comprising part of a polynucleic acid as defined above, with said primer being able to act as a primer for specifically amplifying the nucleic acid of a certain HCV isolate belonging to the genotype from which the primer is derived.

[0060] The term “primer” refers to a single stranded DNA oligonucleotide sequence capable of acting as a point of initiation for synthesis of a primer extension product which is complementary to the nucleic acid strand to be copied. The length and the sequence of the primer must be such that they allow priming of the synthesis of the extension products. Preferably the primer is about 5-50 nucleotides long. Specific length and sequence will depend on the complexity of the required DNA or RNA targets, as well as on the conditions of primer use such as temperature and ionic strength.

[0061] The fact that amplification primers do not have to match exactly with corresponding template sequence to warrant proper amplification is amply documented in the literature (Kwok et al., 1990).

[0062] The amplification method used can be either polymerase chain reaction (PCR; Saiki et al., 1988), ligase chain reaction (LCR; Landgren et al., 1988; Wu & Wallace, 1989; Barany, 1991), nucleic acid sequence-based amplification (NASBA; Guatelli et al., 1990; Compton, 1991), transcription-based amplification system (TAS; Kwoh et al., 1989), strand displacement amplification (SDA; Duck, 1990; Walker et al., 1992) or amplification by means of Qβ replicase (Lizardi et al., 1988; Lomeli et al., 1989) or any other suitable method to amplify nucleic acid molecules using primer extension. During amplification, the amplified products can be conveniently labelled either using labelled primers or by incorporating labelled nucleotides. Labels may be isotopic (³²P, ³⁵S, etc.) or non-isotopic (biotin, digoxigenin, etc.). The amplification reaction is repeated between 20 and 70 times, advantageously between 25 and 45 times.

[0063] The present invention also relates to an oligonucleotide probe comprising part of a polynucleic acid as defined above, with said probe being able to act as a hybridization probe for specific detection and/or classification into types and/or subtypes of an HCV nucleic acid containing said nucleotide sequence, with said probe being possibly labelled or attached to a solid substrate.

[0064] The term “probe” refers to single stranded sequence-specific oligonucleotides which have a sequence which is complementary to the target sequence of the HCV genotype(s) to be detected.

[0065] Preferably, these probes are about 5 to 50 nucleotides long, more preferably from about 10 to 25 nucleotides.

[0066] The term “solid support” can refer to any substrate to which an oligonucleotide probe can be coupled, provided that it retains its hybridization characteristics and provided that the background level of hybridization remains low. Usually the solid substrate will be a microtiter plate, a membrane (e.g. nylon or nitrocellulose) or a microsphere (bead). Prior to application to the membrane or fixation it may be convenient to modify the nucleic acid probe in order to facilitate fixation or improve the hybridization efficiency. Such modifications may encompass homopolymer tailing, coupling with different reactive groups such as aliphatic groups, NH₂ groups, SH groups, carboxylic groups, or coupling with biotin or haptens.

[0067] The present invention also relates to a diagnostic kit for use in determining the genotype of HCV, said kit comprising a primer as defined above.

[0068] The present invention also relates to a diagnostic kit for use in determining the genotype of HCV, said kit comprising a probe as defined above.

[0069] The present invention also relates to a diagnostic kit as defined above, wherein said probe(s) is(are) attached to a solid substrate.

[0070] The present invention also relates to a diagnostic kit as defined above, wherein a range of said probes is attached to specific locations on a solid substrate.

[0071] The present invention also relates to a diagnostic kit as defined above, wherein said solid support is a membrane strip and said probes are coupled to the membrane in the form of parallel lines.

[0072] The present invention also relates to a method for the detection of HCV nucleic acids present in a biological sample, comprising:

[0073] (i) possibly extracting sample nucleic acid,

[0074] (ii) amplifying the nucleic acid with at least one primer as defined above,

[0075] (iii) detecting the amplified nucleic acids.

[0076] The present invention also relates to a method for the detection of HCV nucleic acids present in a biological sample, comprising:

[0077] (i) possibly extracting sample nucleic acid,

[0078] (ii) possibly amplifying the nucleic acid with at least one primer as defined above, or with a universal HCV primer,

[0079] (iii) hybridizing the nucleic acids of the biological sample, possibly under denatured conditions, at appropriate conditions with one or more probes as defined above, with said probes being preferably attached to a solid substrate,

[0080] (iv) possibly washing at appropriate conditions,

[0081] (v) detecting the hybrids formed.

[0082] The present invention also relates to a method for detecting the presence of one or more HCV genotypes present in a biological sample, comprising:

[0083] (i) possibly extracting sample nucleic acid,

[0084] (ii) specifically amplifying the nucleic acid with at least one primer as defined above,

[0085] (iii) detecting said amplified nucleic acids.

[0086] The present invention also relates to a method for detecting the presence of one or more HCV genotypes present in a biological sample, comprising:

[0087] (i) possibly extracting sample nucleic acid,

[0088] (ii) possibly amplifying the nucleic acid with at least one primer as defined above or with a universal HCV primer,

[0089] (iii) hybridizing the nucleic acids of the biological sample, possibly under denatured conditions, at appropriate conditions with one or more probes as defined above, with said probes being preferably attached to a solid substrate,

[0090] (iv) possibly washing at appropriate conditions,

[0091] (v) detecting the hybrids formed,

[0092] (vi) inferring the presence of one or more HCV genotypes present from the observed hybridization pattern.

[0093] The present invention also relates to a method as defined above, wherein said probes are further characterized as defined above.

[0094] The present invention also relates to a method as defined above, wherein said nucleic acids are labelled during or after amplification.

[0095] Preferably, this technique could be performed in the 5′ non-coding, Core or NS5B region.

[0096] The term “nucleic acid” can also be referred to as analyte strand and corresponds to a single- or double-stranded nucleic acid molecule. This analyte strand is preferably positive- or negative-stranded RNA, cDNA or amplified cDNA.

[0097] The term “biological sample” refers to any biological sample (tissue or fluid) containing HCV nucleic acid sequences and refers more particularly to blood serum or plasma samples.

[0098] The term “universal HCV primer” refers to oligonucleotide sequences complementary to any of the conserved regions of the HCV genome.

[0099] The expression “appropriate” hybridization and washing conditions is to be understood as referring to stringent conditions, which are generally known in the art (e.g. Maniatis et al., Molecular Cloning: A Laboratory Manual, New York, Cold Spring Harbor Laboratory, 1982).

[0100] However, according to the hybridization solution (SSC, SSPE, etc.), these probes should be hybridized at their appropriate temperature in order to attain sufficient specificity.

[0101] The term “labelled” refers to the use of labelled nucleic acids. This may include the use of labelled nucleotides incorporated during the polymerase step of the amplification such as illustrated by Saiki et al. (1988) or Bej et al. (1990) or labelled primers, or by any other method known to the person skilled in the art.

[0102] The process of the invention comprises the steps of contacting any of the probes as defined above, with one of the following elements:

[0103] either a biological sample in which the nucleic acids are made available for hybridization,

[0104] or the purified nucleic acids contained in the biological sample

[0105] or a single copy derived from the purified nucleic acids,

[0106] or an amplified copy derived from the purified nucleic acids, with said elements or with said probes being attached to a solid substrate.

[0107] The expression “inferring the presence of one or more HCV genotypes present from the observed hybridization pattern” refers to the identification of the presence of HCV genomes in the sample by analyzing the pattern of binding of a panel of oligonucleotide probes. Single probes may provide useful information concerning the presence or absence of HCV genomes in a sample. On the other hand, the variation of the HCV genomes is dispersed in nature, so rarely is any one probe able to identify uniquely a specific HCV genome. Rather, the identity of an HCV genotype may be inferred from the pattern of binding of a panel of oligonucleotide probes, which are specific for (different) segments of the different HCV genomes. Depending on the choice of these oligonucleotide probes, each known HCV genotype will correspond to a specific hybridization pattern upon use of a specific combination of probes. Each HCV genotype will also be able to be discriminated from any other HCV genotype amplified with the same primers depending on the choice of the oligonucleotide probes. Comparison of the generated pattern of positively hybridizing probes for a sample containing one or more unknown HCV sequences to a scheme of expected hybridization patterns allows one to clearly infer the HCV genotypes present in said sample.
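The comparison of an observed pattern with a scheme of expected patterns can be sketched as a simple set lookup. The probe names and the expected patterns below are purely hypothetical placeholders, not the probes or the interpretation scheme of the invention, and the handling of mixed samples (a union of complete single-genotype patterns) is an assumption made for illustration only.

```python
# Hypothetical scheme: genotype -> set of probes expected to hybridize.
EXPECTED_PATTERNS = {
    "1d": frozenset({"probe_1", "probe_4"}),
    "2e": frozenset({"probe_2", "probe_4"}),
    "7c": frozenset({"probe_3", "probe_5"}),
}


def infer_genotypes(positive_probes, expected=EXPECTED_PATTERNS):
    """Return the genotypes compatible with the observed positive probes.

    An exact match to one expected pattern yields that genotype. Otherwise
    the observation is read as a possible mixture: every genotype whose full
    pattern is present is a candidate, and the candidates are accepted only
    if together they explain all positive probes. An empty list signals a
    pattern that is incompatible with the scheme.
    """
    observed = frozenset(positive_probes)
    exact = sorted(g for g, pattern in expected.items() if pattern == observed)
    if exact:
        return exact
    candidates = sorted(g for g, pattern in expected.items() if pattern <= observed)
    explained = frozenset().union(*(expected[g] for g in candidates))
    return candidates if candidates and explained == observed else []
```

In this sketch an empty result corresponds to an aberrant hybridization pattern, which is the situation in which the corresponding genome region would be sequenced to characterize a possibly new genotype.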

[0108] The present invention thus relates to a method as defined above, wherein one or more hybridization probes are selected from any of SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103 or 105 or sequence variants thereof as defined above.

[0109] In order to distinguish the amplified HCV genomes from each other, the target polynucleic acids are hybridized to a set of sequence-specific DNA probes targeting HCV genotypic regions (unique regions) located in the HCV polynucleic acids.

[0110] Most of these probes target the most type- or subtype-specific regions of HCV genotypes, but some can be caused to hybridize to more than one HCV genotype.

[0111] According to the hybridization solution (SSC, SSPE, etc.), these probes should be stringently hybridized at their appropriate temperature in order to attain sufficient specificity. However, by slightly modifying the DNA probes, either by adding or deleting one or a few nucleotides at their extremities (either 3′ or 5′), or substituting some non-essential nucleotides (i.e. nucleotides not essential to discriminate between types) by others (including modified nucleotides or inosine) these probes or variants thereof can be caused to hybridize specifically at the same hybridization conditions (i.e. the same temperature and the same hybridization solution). Also changing the amount (concentration) of probe used may be beneficial to obtain more specific hybridization results. It should be noted in this context, that probes of the same length, regardless of their GC content, will hybridize specifically at approximately the same temperature in TMACl solutions (Jacobs et al., 1988).

[0112] Suitable assay methods for purposes of the present invention to detect hybrids formed between the oligonucleotide probes and the nucleic acid sequences in a sample may comprise any of the assay formats known in the art, such as the conventional dot-blot format, sandwich hybridization or reverse hybridization. For example, the detection can be accomplished using a dot blot format, the unlabelled amplified sample being bound to a membrane, the membrane being incubated with at least one labelled probe under suitable hybridization and wash conditions, and the presence of bound probe being monitored.

[0113] An alternative and preferred method is a “reverse” dot-blot format, in which the amplified sequence contains a label. In this format, the unlabelled oligonucleotide probes are bound to a solid support and exposed to the labelled sample under appropriate stringent hybridization and subsequent washing conditions. It is to be understood that also any other assay method which relies on the formation of a hybrid between the nucleic acids of the sample and the oligonucleotide probes according to the present invention may be used.

[0114] According to an advantageous embodiment, the process of detecting one or more HCV genotypes contained in a biological sample comprises the steps of contacting amplified HCV nucleic acid copies derived from the biological sample, with oligonucleotide probes which have been immobilized as parallel lines on a solid support.

[0115] According to this advantageous method, the probes are immobilized in a Line Probe Assay (LiPA) format. This is a reverse hybridization format (Saiki et al., 1989) using membrane strips onto which several oligonucleotide probes (including negative or positive control oligonucleotides) can be conveniently applied as parallel lines.

[0116] The invention thus also relates to a solid support, preferably a membrane strip, carrying on its surface, one or more probes as defined above, coupled to the support in the form of parallel lines.

[0117] The LiPA is a very rapid and user-friendly hybridization test. Results can be read 4 hours after the start of the amplification. After amplification, during which usually a non-isotopic label is incorporated in the amplified product, and alkaline denaturation, the amplified product is contacted with the probes on the membrane and the hybridization is carried out for about 1 to 1.5 h, after which the hybridized polynucleic acid is detected. From the hybridization pattern generated, the HCV type can be deduced either visually or, preferably, using dedicated software. The LiPA format is completely compatible with commercially available scanning devices, thus rendering automatic interpretation of the results very reliable. All these advantages make the LiPA format suitable for HCV detection in a routine setting. The LiPA format should be particularly advantageous for detecting the presence of different HCV genotypes.

[0118] The present invention also relates to a method for detecting and identifying novel HCV genotypes, different from the known HCV genomes, comprising the steps of:

[0119] determining to which HCV genotype the nucleotides present in a biological sample belong, according to the process as defined above,

[0120] in the case of observing a sample which does not generate a hybridization pattern compatible with those defined in Table 3, sequencing the portion of the HCV genome sequence corresponding to the aberrantly hybridizing probe of the new HCV genotype to be determined.

[0121] The present invention also relates to a method for preparing a polynucleic acid according to the present invention. Such methods include any method known in the art for preparing polynucleic acids (e.g. the phosphodiester method for synthesizing oligonucleotides as described by Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl. 11:451, the phosphotriester method of Hsiung et al. 1979, Nucleic Acid Res. 6:1371, or the automated diethylphosphoramidite method of Beaucage et al. 1981, Tetrahedron Letters 22:1859-1862). Alternatively, the polynucleic acids of the present invention may be isolated fragments of naturally occurring or cloned DNA or RNA. In addition, the oligonucleotides according to the present invention may be synthesized automatically on commercial instruments sold by a variety of manufacturers.

[0122] The present invention particularly also relates to a polypeptide having an amino acid sequence encoded by a polynucleic acid as defined above, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in Table 5, and which contains at least one amino acid differing from any of the known HCV types or subtypes, or an analog thereof being substantially homologous and biologically equivalent.

[0123] The term ‘polypeptide’ refers to a polymer of amino acids and does not refer to a specific length of the product; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not refer to or exclude post-expression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example, polypeptides containing one or more analogues of an amino acid (including, for example, unnatural amino acids, PNA, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.

[0124] The term “unique” is as defined above.

[0125] By “biologically equivalent” as used throughout the specification and claims, it is meant that the compositions are immunogenically equivalent to the proteins (polypeptides) or peptides of the invention as defined above and below.

[0126] By “substantially homologous” as used throughout the ensuing specification and claims to describe proteins and peptides, it is meant a degree of homology in the amino acid sequence to the proteins or peptides of the invention. Preferably the degree of homology is in excess of 90%, more preferably in excess of 95%, with a particularly preferred group of proteins being in excess of 99% homologous with the proteins or peptides of the invention.

[0127] The term “analog” as used throughout the specification or claims to describe the proteins or peptides of the present invention, includes any protein or peptide having an amino acid residue sequence substantially identical to a sequence specifically shown herein in which one or more residues have been conservatively substituted with a biologically equivalent residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another, the substitution of one polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between glycine and serine, the substitution of one basic residue such as lysine, arginine or histidine for another, or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another. Examples of allowable mutations according to the present invention can be found in Table 4.

[0128] The phrase “conservative substitution” also includes the use of a chemically derivatized residue in place of a non-derivatized residue provided that the resulting protein or peptide is biologically equivalent to the protein or peptide of the invention. “Chemical derivative” refers to a protein or peptide having one or more residues chemically derivatized by reaction of a functional side group. Examples of such derivatized molecules include, but are not limited to, those molecules in which free amino groups have been derivatized to form amine hydrochlorides, p-toluene sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl groups, chloroacetyl groups or formyl groups. Free carboxyl groups may be derivatized to form salts, methyl and ethyl esters or other types of esters or hydrazides. Free hydroxyl groups may be derivatized to form O-acyl or O-alkyl derivatives. The imidazole nitrogen of histidine may be derivatized to form N-im-benzylhistidine. Also included as chemical derivatives are those proteins or peptides which contain one or more naturally-occurring amino acid derivatives of the twenty standard amino acids. For example: 4-hydroxyproline may be substituted for proline; 5-hydroxylysine may be substituted for lysine; 3-methylhistidine may be substituted for histidine; homoserine may be substituted for serine; and ornithine may be substituted for lysine. The proteins or peptides of the present invention also include any protein or peptide having one or more additions and/or deletions of residues relative to the sequence of a peptide whose sequence is shown herein, so long as the peptide is biologically equivalent to the proteins or peptides of the invention.

[0129] It is to be noted that, at the level of the amino acid sequence, at least one amino acid difference (with respect to known HCV amino acid sequences) is sufficient to be part of the invention, which means that the polypeptides of the invention correspond to polynucleic acids having at least one nucleotide difference (with known HCV polynucleic acid sequences) involving an amino acid difference in the encoded polyprotein.

[0130] As the NS4 and the Core regions are known to contain several epitopes, for example characterized in patent application EP-A-0 489 968, and as the E1 protein is expected to be subject to immune attack as part of the viral envelope and expected to contain epitopes, the NS4, Core and E1 epitopes of the new types and subtypes disclosed herein will consistently differ from the epitopes present in previously known genotypes. This is exemplified by the type-specificity of NS4 synthetic peptides as described in Simmonds et al. (1993c) and Stuyver et al. (1993b) and PCT/EP 94/01323 and the type-specificity of recombinant E1 proteins as described in Maertens et al. (1994).

[0131] The peptides according to the present invention preferably contain at least 3, preferably 4 or 5, more preferably 6 or 7, and most preferably at least 8 contiguous HCV amino acids, or at least 10 or at least 15 (for instance at least 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more amino acids).

TABLE 4
Amino acids    Synonymous groups
Ser (S)        Ser, Thr, Gly, Asn
Arg (R)        Arg, His, Lys, Glu, Gln
Leu (L)        Leu, Ile, Met, Phe, Val, Tyr
Pro (P)        Pro, Ala, Thr, Gly
Thr (T)        Thr, Pro, Ser, Ala, Gly, His, Gln
Ala (A)        Ala, Pro, Gly, Thr
Val (V)        Val, Met, Ile, Tyr, Phe, Leu, Val
Gly (G)        Gly, Ala, Thr, Pro, Ser
Ile (I)        Ile, Met, Leu, Phe, Val, Ile, Tyr
Phe (F)        Phe, Met, Tyr, Ile, Leu, Trp, Val
Tyr (Y)        Tyr, Phe, Trp, Met, Ile, Val, Leu
Cys (C)        Cys, Ser, Thr, Met
His (H)        His, Gln, Arg, Lys, Glu, Thr
Gln (Q)        Gln, Glu, His, Lys, Asn, Thr, Arg
Asn (N)        Asn, Asp, Ser, Gln
Lys (K)        Lys, Arg, Glu, Gln, His
Asp (D)        Asp, Asn, Glu, Gln
Glu (E)        Glu, Gln, Asp, Lys, Asn, His, Arg
Met (M)        Met, Ile, Leu, Phe, Val

[0132] Table 4 Overview of the amino acid substitutions which could form the basis of analogs (muteins) as defined above
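The synonymous groups of Table 4 lend themselves to a lookup that tests whether a candidate peptide differs from a reference only by substitutions listed in the table. The sketch below is illustrative: the one-letter encoding of the table and the function names `is_allowed_substitution` and `is_table4_analog` are assumptions, and the check covers substitutions only, not additions, deletions or biological equivalence.

```python
# Table 4 in one-letter code: original residue -> allowed replacement residues.
# Trp (W) has no row in Table 4, so only identity is accepted for it here.
TABLE_4 = {
    "S": "STGN",    "R": "RHKEQ",   "L": "LIMFVY",  "P": "PATG",
    "T": "TPSAGHQ", "A": "APGT",    "V": "VMIYFL",  "G": "GATPS",
    "I": "IMLFVY",  "F": "FMYILWV", "Y": "YFWMIVL", "C": "CSTM",
    "H": "HQRKET",  "Q": "QEHKNTR", "N": "NDSQ",    "K": "KREQH",
    "D": "DNEQ",    "E": "EQDKNHR", "M": "MILFV",
}


def is_allowed_substitution(original: str, replacement: str) -> bool:
    """True if `replacement` is identical to or synonymous with `original`."""
    return original == replacement or replacement in TABLE_4.get(original, "")


def is_table4_analog(reference: str, candidate: str) -> bool:
    """True if `candidate` has the same length as `reference` and every
    position is either unchanged or an allowed Table 4 substitution."""
    if len(reference) != len(candidate):
        return False
    return all(
        is_allowed_substitution(ref, cand)
        for ref, cand in zip(reference.upper(), candidate.upper())
    )
```

For example, the two V-core sequences given for subtype 1d in this specification, ARQSDGRSWAQ and ARRSEGRSWAQ, differ by Gln to Arg and Asp to Glu, both of which appear in the corresponding rows of Table 4.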

[0133] The polypeptides of the invention, and particularly the fragments, can be prepared by classical chemical synthesis.

[0134] The synthesis can be carried out in homogeneous solution or in solid phase.

[0135] For instance, the synthesis technique in homogeneous solution which can be used is the one described by Houben-Weyl in the book entitled “Methoden der organischen Chemie” (Methods of organic chemistry) edited by E. Wünsch, vol. 15-I and II, THIEME, Stuttgart 1974.

[0136] The polypeptides of the invention can also be prepared in solid phase according to the methods described by Atherton and Sheppard in their book entitled “Solid phase peptide synthesis” (IRL Press, Oxford, 1989).

[0137] The polypeptides according to this invention can also be prepared by means of recombinant DNA techniques as described by Maniatis et al. (Molecular Cloning: A Laboratory Manual, New York, Cold Spring Harbor Laboratory, 1982).

[0138] The present invention relates particularly to a polypeptide as defined above, comprising in its amino acid sequence at least one of the following amino acid residues: I15, C38, V44, A49, Q43, P49, Q55, A58, S60 or D60, E68 or V68, H70, A71 or Q71 or N71, D72, H81, H101, D106, S110, L130, I134, E135, L140, S148, T150 or E150, Q153, F155, D157, G160, E165, I169, F181, L186, T190, T192 or I192 or H192, I193, A195, S196, R197 or N197 or K197, Q199 or D199 or H199, N199, F200 or T200, A208, I213, M216 or S216, N217 or S217 or G217 or K217, T218, I219, A222, Y223, I230, W231 or L231, S232 or H232 or A232, Q233, E235 or L235, F236 or T236, F237, L240 or M240, A242, N244, N249, I250 or K250 or R250, A252 or C252, A254, I255 or V255, D256 or M256, E257, E260 or K260, R261, V268, S272 or R272, I285, G290 or F290, A291, A293 or L293 or W293, T294 or A294, S295, H295, K296 or E296, Y297 or M297, I299 or Y299, I300, S301, P316, S2646, A2648, G2649, A2650, V2652, Q2653, H2656 or L2656, D2657, F2659, K2663 or Q2663, A2667 or V2667, D2677, L2681, M2686 or Q2686 or E2686, A2692 or K2692, H2697, I2707, L2708 or Y2708, A2709, A2719 or M2719, F2727, T2728 or D2728, E2729, F2730 or Y2730, I2741, I2745, V2746 or E2746 or L2746 or K2746, A2748, S2749 or P2749, R2750, E2751, D2752 or N2752 or S2752 or T2752 or V2752 or I2752 or Q2752, S2753 or D2753 or G2753, D2754, A2755, L2756 or Q2756, or R2757,

[0139] with said notation being composed of a letter representing the amino acid residue by its one-letter code, and a number representing the amino acid numbering according to Kato et al., 1990 as shown in Table 1 (see also the numbering in FIGS. 2, 4 and 6), or a part thereof which is unique to at least one of the HCV subtypes or types as defined in Table 5, and which contains at least one amino acid differing from any of the known HCV types or subtypes, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide or part thereof.

[0140] These unique amino acid residues can be deduced from aligning the new HCV amino acid sequences as given in FIG. 3 to all known HCV sequences. An alignment with the new sequences as represented in SEQ ID NO 1 to 106 is given in for instance FIGS. 2, 4 and 6. It should be clear that the alignments given in these figures may be completed with all known HCV sequences to illustrate that any of the above-given unique residues is indeed unique for at least one of the new HCV sequences of the present invention.

[0141] Within the group of unique and new amino acid residues of the present invention, unique residues may be found which are specific for the following new types (subtypes) of HCV according to the HCV classification system used in the present invention: type 1 subtype 1d, 1e, 1f or 1g isolates; type 2 subtype 2e, 2f, 2g, 2h, 2i, 2k or 2l isolates; type 3 subtype 3g isolates; type 4 subtype 4k, 4l or 4m isolates; type 7 subtype 7a, 7c or 7d isolates, type 9, type 10 or type 11 isolates. In order to obtain these residues the alignments given in FIGS. 2, 4 and 6 may be used to deduce the type- and/or subtype-specificity of any of the unique residues given above.

[0142] For example T190 (detected in subtype 1d) refers to a threonine at position 190 (see FIG. 2). In other sequences only a serine (S190) or exceptionally an alanine (A190 in type 10a) can be detected.
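The residue notation (a one-letter amino acid code followed by a position number) can be parsed and checked against a sequence fragment as sketched below. The helper names `parse_residue` and `has_residue` are hypothetical, and the caller is assumed to know the position number of the first residue of the fragment being tested.

```python
import re

_RESIDUE_NOTATION = re.compile(r"([A-Z])(\d+)")


def parse_residue(notation: str) -> tuple[str, int]:
    """Split a notation such as 'T190' into ('T', 190)."""
    match = _RESIDUE_NOTATION.fullmatch(notation.strip())
    if match is None:
        raise ValueError(f"not a residue notation: {notation!r}")
    return match.group(1), int(match.group(2))


def has_residue(sequence: str, first_position: int, notation: str) -> bool:
    """True if `sequence`, whose first residue carries the position number
    `first_position`, has the stated amino acid at the stated position."""
    amino_acid, position = parse_residue(notation)
    index = position - first_position
    return 0 <= index < len(sequence) and sequence[index] == amino_acid


# A subtype 1d sequence listed in this specification for positions 192 to 203.
fragment_1d = "HEVRNASGVYHV"
print(has_residue(fragment_1d, 192, "H192"))
```

A position lying outside the supplied fragment is reported as absent rather than raising an error, which keeps the check usable on partial sequences.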

[0143] The polypeptides according to this embodiment of the invention may be possibly labelled, or attached to a solid substrate, or coupled to a carrier molecule such as biotin, or mixed with a proper adjuvant all known in the art and according to the intended use (diagnostic, therapeutic or prophylactic).

[0144] The present invention also relates to a polypeptide as defined above, comprising in its amino acid sequence at least one of the sequences represented by SEQ ID NO 107 to 207 as listed above, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in Table 5, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide or part thereof.

[0145] The present invention relates also to a polypeptide having an amino acid sequence as represented in any of SEQ ID NO 1 to 106, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in Table 5, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide or part thereof.

[0146] The variable region in the core protein (V-CORE in FIG. 2) has been shown to be useful for serotyping (Machida et al., 1992). The type 1 subtype 1d, 1e, 1f or 1g sequences; type 2 subtype 2e, 2f, 2g, 2h, 2i, 2k and 2l sequences; type 3 subtype 3g sequences; type 4 subtype 4k, 4l or 4m sequences; and type 7 (subtype 7a, 7c and 7d), 9, 10 or 11 sequences of the present invention show type-specific features in this region. The peptide from amino acid 68 to 78 (V-core region) shows the following unique sequences for the sequences of the present invention (see FIG. 2):

ARQSDGRSWAQ or ARRSEGRSWAQ as for subtype 1d (SEQ ID NO 107 and 108)
ERRPEGRSWAQ as for subtype 1e (SEQ ID NO 109)
ARRPEGRSWAQ as for subtype 1f (SEQ ID NO 110)
DRRTTGKSWGR as for subtype 2k (SEQ ID NO 111)
DRRATGRSWGR as for subtype 2e (SEQ ID NO 112)
DRRATGKSWGR as for subtype 2f (SEQ ID NO 113)
VRQPTGRSWGQ as for type 9 (SEQ ID NO 114)
VRHQTGRTWAQ as for subtype 7a and 7c (SEQ ID NO 115)
VRQNQGRTWAQ as for subtype 7d (SEQ ID NO 116)
ARRTEGRSWAQ as for type 10 (SEQ ID NO 117)
VRRTTGRXXXX or VRRTTGRTWAQ as for type 11 (SEQ ID NO 118 and 119)

[0147] Five type-specific variable regions (V1 to V5) can be identified after aligning E1 amino acid sequences of the genotypes of the present invention to the genotypes already known, as shown in FIG. 2.

Region V1 encompasses amino acids 192 to 203; this is the amino-terminal 10 amino acids of the E1 protein. The following unique sequences as shown in FIG. 2 can be deduced:
HEVRNASGVYHV or HEVRNASGVYHL as for subtype 1d (SEQ ID NO 120 and 121)
YEVHSTTDGYHV as for subtype 1f (SEQ ID NO 122)
VEVKNTSQAYMA as for subtype 2e (SEQ ID NO 123)
IQVKNNSHFYMA as for subtype 2f (SEQ ID NO 124)
VQVKNTSTMYMA as for subtype 2g (SEQ ID NO 125)
VQVKNTSHSYMV as for subtype 2h (SEQ ID NO 126)
VQVANRSGSYMV as for subtype 2i (SEQ ID NO 127)
VEIKNTXNTYVL or VEIKNTSNTYVL as for subtype 2k (SEQ ID NO 128 and 129)
INYRNVSGIYYV or INYRNTSGIYHV or INYHNTSGIYHI or TNYRNVSGIYHV as for subtype 4k (SEQ ID NO 130, 131, 132 or 133)
QHYRNVSGIYHV as for subtype 4l (SEQ ID NO 134)
IQVKNASGIYHL as for type 9 (SEQ ID NO 135)
AHYTNKSGLYHL as for subtype 7c (SEQ ID NO 136)
LNYANKSGLYHL as for subtype 7d (SEQ ID NO 137)
LEYRNASGLYMV as for type 10 (SEQ ID NO 138)

Region V2 encompasses amino acids 213 to 223. The following unique sequences can be found in the V2 region as shown in FIG. 2:
IYEMDGMIMHY or IYEMSGMILHA as for subtype 1d (SEQ ID NO 139 and 140)
VYEAKDIILHT as for subtype 1f (SEQ ID NO 141)
VWQLXDAVLHV as for subtype 2e (SEQ ID NO 142)
VWQLRDAVLHV as for subtype 2f (SEQ ID NO 143)
IWQMQGAVLHV as for subtype 2g (SEQ ID NO 144)
VWQLKDAVLHV as for subtype 2h (SEQ ID NO 145)
VWQLEEAVLHV as for subtype 2i (SEQ ID NO 146)
TWQLXXAVLHV as for subtype 2k (SEQ ID NO 147)
VYEADHHILHL or VYEADHHILAL or VFEADHHILHL as for subtype 4k (SEQ ID NO 148, 149 and 150)
VYESDHHILHL as for subtype 4l (SEQ ID NO 151)
VFEAETMILHL as for type 9 (SEQ ID NO 152)
VYEAETLILHL as for subtype 7c (SEQ ID NO 153)
VYEANGMILHL as for subtype 7d (SEQ ID NO 154)
VYEAGDIILHL as for type 10 (SEQ ID NO 155)

Region V3 encompasses the amino acids 230 to 242. The following unique V3 region sequences can be deduced from FIG. 2:
VREDNHLRCWMAL or VRENNSSRCWMAL as for subtype 1d (SEQ ID NO 156 and 157)
IREGNISRCWVLP as for subtype 1f (SEQ ID NO 158)
ENSSGRFHCWIPI as for subtype 2e (SEQ ID NO 159)
ERSGNRTFCWTAV as for subtype 2f (SEQ ID NO 160)
ELQGNKSRCWIPV as for subtype 2g (SEQ ID NO 162)
ERHQNQSRCWIPV as for subtype 2h (SEQ ID NO 163)
EWKDNTSRCWIPV as for subtype 2i (SEQ ID NO 164)
EREGNSSRCWIPV as for subtype 2k (SEQ ID NO 165)
VREGNQSRCWVAL or VRTGNQSRCWVAL or VRVGNQSSCWVAL or VRVGNQSRCWVAL or VKEGNHSRCWVAL as for subtype 4k (SEQ ID NO 166, 167, 168 or 169)
VKTGNTSRCWVAL as for subtype 4l (SEQ ID NO 170)
IKAGNESRCWLPV as for type 9 (SEQ ID NO 171)
VKXXNQSRCWVQA as for subtype 7c (SEQ ID NO 172)
VKTGNLTKCWLSA as for subtype 7d (SEQ ID NO 173)
VRSGNTSRCWIPV as for type 10 (SEQ ID NO 174)

Region V4 encompasses the amino acids 248 to 257. The following unique V4 region sequences can be deduced from FIG. 2:
VKNASVPTAA or VKDANVPTAA as for subtype 1d (SEQ ID NO 175 and 176)
ARIANAPIDE as for subtype 1f (SEQ ID NO 177)
VSKPGALTKG as for subtype 2e (SEQ ID NO 178)
VSRPGALTRG as for subtype 2f (SEQ ID NO 179)
VNQPGALTRG as for subtype 2g (SEQ ID NO 180)
VSQPGALTRG as for subtype 2h (SEQ ID NO 181)
VSQPGALTKG as for subtype 2i (SEQ ID NO 182)
VSRPGALTEG as for subtype 2k (SEQ ID NO 183)
APYIGAPLES or APYTAAPLES as for subtype 4k (SEQ ID NO 184 and 185)
APILSAPLMS as for subtype 4l (SEQ ID NO 186)
VPNSSVPIHG as for type 9 (SEQ ID NO 187)
VPNASTPVTG as for subtype 7c (SEQ ID NO 188)
VQNASVSIRG as for subtype 7d (SEQ ID NO 189)
VKSPCAATAS as for type 10 (SEQ ID NO 190)

Region V5 encompasses the amino acids 294 to 303. The following unique V5 region peptides can be deduced from FIG. 2:
SPRMHHTTQE or SPRLYHTTQE as for subtype 1d (SEQ ID NO 191 and 192)
TSRRHWTVQD as for subtype 1f (SEQ ID NO 193)
APKRHYFVQE as for subtype 2e (SEQ ID NO 194)
SPQYHTFVQE as for subtype 2f (SEQ ID NO 195)
SPQHHNFSQD as for subtype 2g (SEQ ID NO 196)
SPQHHIFVQD as for subtype 2h (SEQ ID NO 197)
SPEHHHFVQD as for subtype 2k (SEQ ID NO 198)
RPRRHWTTQD or RPRRHWTAQD or QPRRHWTTQD or RPRRHWTTQE as for subtype 4k (SEQ ID NO 199, 200, 201 or 202)
QPRRHWTVQD as for subtype 4l (SEQ ID NO 203)
RPKYHQVTQD as for type 9 (SEQ ID NO 204)
RPRMHQVVQE as for subtype 7c (SEQ ID NO 205)
RPRMYEIAQD as for subtype 7d (SEQ ID NO 206)
RHRQHWTVQD as for type 10 (SEQ ID NO 207)

[0148] The peptides listed above are particularly useful for the development of treatments, vaccines and diagnostics.

[0149] Also comprised in the present invention is any synthetic peptide (see below) or polypeptide containing at least an epitope derived from the above-defined peptides in their peptidic chain. Also comprised within the present invention is any synthetic peptide or polypeptide comprising at least 6, 7, 8, or 9 contiguous amino acids derived from the above-defined peptides in their peptidic chain.
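The notion of "at least 6, 7, 8, or 9 contiguous amino acids derived from the above-defined peptides" can be made concrete with a short enumeration. The sketch below is illustrative only; the function names `contiguous_fragments` and `shares_fragment` are hypothetical, and sequences are assumed to be plain one-letter amino acid strings.

```python
from typing import Iterable, Set


def contiguous_fragments(peptide: str,
                         lengths: Iterable[int] = (6, 7, 8, 9)) -> Set[str]:
    """Return every contiguous stretch of the given lengths found in peptide."""
    fragments = set()
    for k in lengths:
        # A peptide of length n has n - k + 1 windows of length k
        # (none when k > n, because the range is then empty).
        for i in range(len(peptide) - k + 1):
            fragments.add(peptide[i:i + k])
    return fragments


def shares_fragment(polypeptide: str, peptide: str, min_len: int = 6) -> bool:
    """True if polypeptide contains min_len contiguous residues of peptide."""
    return any(frag in polypeptide
               for frag in contiguous_fragments(peptide, [min_len]))


# Example with the subtype 1d V-core peptide from paragraph [0146].
fragments = contiguous_fragments("ARQSDGRSWAQ")
print(len(fragments))
print(shares_fragment("MMMSDGRSWKKK", "ARQSDGRSWAQ"))
```

Checking only the minimum length is sufficient in `shares_fragment`, since any longer shared stretch necessarily contains a shared stretch of the minimum length.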

[0150] As used herein, ‘epitope’ or ‘antigenic determinant’ means an amino acid sequence that is immunoreactive. Generally an epitope consists of at least 3 to 4 amino acids, and more usually of at least 5 or 6 amino acids; sometimes the epitope consists of about 7 to 8, or even about 10 amino acids.

[0151] The present invention particularly relates to any peptide (see below) or polypeptide contained in any of the amino acid sequences as represented in SEQ ID NO 2, 4, 7, 9, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104 or 106 (see Table 5 and FIG. 3, Examples section).

[0152] The present invention also relates to a recombinant polypeptide encoded by a polynucleic acid as defined above, or a part thereof which is unique to any of the HCV subtypes or types as defined in Table 5, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide.

[0153] The present invention also relates to a recombinant expression vector comprising a polynucleic acid or a part thereof as defined above, operably linked to prokaryotic, eukaryotic or viral transcription and translation control elements.

[0154] In general said recombinant vector will comprise a vector sequence, an appropriate prokaryotic, eukaryotic or viral promoter sequence followed by the nucleotide sequences as defined above, with said recombinant vector allowing the expression of any one of the polypeptides as defined above in a prokaryotic, or eukaryotic host or in living mammals when injected as naked DNA, and more particularly a recombinant vector allowing the expression of any of the new HCV sequences of the invention spanning particularly the following amino acid positions:

[0155] a polypeptide starting in the region between positions 1 and 10 and ending at any position in the region between positions 70 and 420, more particularly a polypeptide spanning positions 1 to 70, positions 1 to 85, positions 1 to 120, positions 1 to 150, positions 1 to 191, or positions 1 to 200, for expression of the Core protein, and a polypeptide spanning positions 1 to 263, positions 1 to 326, positions 1 to 383, or positions 1 to 420 for expression of the Core and E1 protein;

[0156] a polypeptide starting at any position in the region between positions 117 and 192, and ending at any position in the region between positions 263 and 420, for expression of E1, or forms that have the hydrophobic region deleted (positions 264 to 293 plus or minus 8 amino acids);

[0157] a polypeptide starting at any position in the region between positions 1556 and 1688, and ending at any position in the region between positions 1739 and 1764, for expression of NS4, more particularly a polypeptide starting at position 1658 and ending at position 1711, for expression of NS4a antigen, and more particularly, a polypeptide starting at position 1712 and ending in the region between positions 1743 and 1972 (for instance 1712-1743, 1712-1764, 1712-1782, 1712-1902, 1712-1972), for expression of NS4b antigen or parts thereof.
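The position ranges given in paragraphs [0155] to [0157] are 1-based and inclusive, which is easy to get wrong when working with 0-based string indices. The following minimal sketch shows how such a span, and an E1 form with the hydrophobic region (positions 264 to 293) deleted, could be cut from a polyprotein sequence. The helper names `subsequence` and `delete_region` are hypothetical, and the polyprotein used is a synthetic placeholder, not an HCV sequence.

```python
def subsequence(polyprotein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of the polyprotein."""
    if not 1 <= start <= end <= len(polyprotein):
        raise ValueError("positions must satisfy 1 <= start <= end <= length")
    return polyprotein[start - 1:end]


def delete_region(polyprotein: str, start: int, end: int,
                  del_start: int, del_end: int) -> str:
    """Return residues start..end with del_start..del_end removed.

    The deleted region must lie strictly inside the requested span.
    """
    if not start < del_start <= del_end < end:
        raise ValueError("deleted region must lie strictly inside the span")
    return (subsequence(polyprotein, start, del_start - 1)
            + subsequence(polyprotein, del_end + 1, end))


# Synthetic 420-residue placeholder so that the spans below are valid.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
polyprotein = "".join(ALPHABET[i % 20] for i in range(420))

core = subsequence(polyprotein, 1, 191)                  # Core, positions 1 to 191
e1_deleted = delete_region(polyprotein, 192, 326, 264, 293)
print(len(core), len(e1_deleted))
```

The E1 span 192 to 326 is just one choice within the ranges stated above; the "plus or minus 8 amino acids" tolerance on the deleted region could be explored by varying `del_start` and `del_end`.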

[0158] Any other HCV vector construction known in the art may also be used for the recombinant polypeptides of the present invention.

[0159] Also any of the known purification methods for recombinant proteins may be used for the production of the recombinant polypeptides of the present invention, particularly the HCV recombinant polypeptide purification methods as disclosed in PCT/EP 95/03031 in name of Innogenetics N.V.

[0160] The term “vector” may comprise a plasmid, a cosmid, a phage, or a virus or a transgenic animal. Particularly useful for vaccine development may be BCG or adenoviral vectors, as well as avipox recombinant viruses.

[0161] The present invention also relates to a method for the production of a recombinant polypeptide as defined above, comprising:

[0162] transformation of an appropriate cellular host with a recombinant vector, in which a polynucleic acid or a part thereof as defined above has been inserted under the control of appropriate regulatory elements,

[0163] culturing said transformed cellular host under conditions enabling the expression of said insert, and,

[0164] harvesting said polypeptide.

[0165] The term ‘recombinantly expressed’ used within the context of the present invention refers to the fact that the proteins of the present invention are produced by recombinant expression methods be it in prokaryotes, or lower or higher eukaryotes as discussed in detail below.

[0166] The term ‘lower eukaryote’ refers to host cells such as yeast, fungi and the like. Lower eukaryotes are generally (but not necessarily) unicellular. Preferred lower eukaryotes are yeasts, particularly species within Saccharomyces, Schizosaccharomyces, Kluyveromyces, Pichia (e.g. Pichia pastoris), Hansenula (e.g. Hansenula polymorpha), Yarrowia, Schwanniomyces, Zygosaccharomyces and the like. Saccharomyces cerevisiae, S. carlsbergensis and K. lactis are the most commonly used yeast hosts, and are convenient fungal hosts.

[0167] The term ‘prokaryotes’ refers to hosts such as E. coli, Lactobacillus, Lactococcus, Salmonella, Streptococcus, Bacillus subtilis or Streptomyces. These hosts are also contemplated within the present invention.

[0168] The term ‘higher eukaryote’ refers to host cells derived from higher animals, such as mammals, reptiles, insects, and the like. Presently preferred higher eukaryote host cells are derived from Chinese hamster (e.g. CHO), monkey (e.g. COS and Vero cells), baby hamster kidney (BHK), pig kidney (PK15), rabbit kidney 13 cells (RK13), the human osteosarcoma cell line 143 B, the human cell line HeLa and human hepatoma cell lines like Hep G2, and insect cell lines (e.g. Spodoptera frugiperda). The host cells may be provided in suspension or flask cultures, tissue cultures, organ cultures and the like. Alternatively the host cells may also be transgenic animals.

[0169] The term ‘recombinant polynucleotide or nucleic acid’ intends a polynucleotide or nucleic acid of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated with all or a portion of a polynucleotide with which it is associated in nature, (2) is linked to a polynucleotide other than that to which it is linked in nature, or (3) does not occur in nature.

[0170] The terms ‘recombinant host cells’, ‘host cells’, ‘cells’, ‘cell lines’, ‘cell cultures’, and other such terms denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which can be, or have been, used as recipients for a recombinant vector or other transfer polynucleotide, and include the progeny of the original cell which has been transfected. It is understood that the progeny of a single parental cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, due to natural, accidental, or deliberate mutation.

[0171] The term ‘replicon’ is any genetic element, e.g., a plasmid, a chromosome, a virus, a cosmid, etc., that behaves as an autonomous unit of polynucleotide replication within a cell; i.e., capable of replication under its own control.

[0172] The term ‘vector’ is a replicon further comprising sequences providing replication and/or expression of a desired open reading frame.

[0173] The term ‘control sequence’ refers to polynucleotide sequences which are necessary to effect the expression of coding sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, splicing sites and terminators; in eukaryotes, generally, such control sequences include promoters, splicing sites, terminators and, in some instances, enhancers. The term ‘control sequences’ is intended to include, at a minimum, all components whose presence is necessary for expression, and may also include additional components whose presence is advantageous, for example, leader sequences which govern secretion.

[0174] The term ‘promoter’ is a nucleotide sequence which is comprised of consensus sequences which allow the binding of RNA polymerase to the DNA template in a manner such that mRNA production initiates at the normal transcription initiation site for the adjacent structural gene.

[0175] The expression ‘operably linked’ refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence ‘operably linked’ to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0176] The segment of the HCV cDNA encoding the desired sequence inserted into the vector sequence may be attached to a signal sequence. Said signal sequence may be that from a non-HCV source, e.g. the IgG or tissue plasminogen activator (tpa) leader sequence for expression in mammalian cells, or the α-mating factor sequence for expression in yeast cells, but particularly preferred constructs according to the present invention contain signal sequences appearing in the HCV genome before the respective start points of the proteins.

[0177] A variety of vectors may be used to obtain recombinant expression of HCV single or specific oligomeric envelope proteins of the present invention. Lower eukaryotes such as yeasts and glycosylation mutant strains are typically transformed with plasmids, or are transformed with a recombinant virus. The vectors may replicate within the host independently, or may integrate into the host cell genome.

[0178] Higher eukaryotes may be transformed with vectors, or may be infected with a recombinant virus, for example a recombinant vaccinia virus. Techniques and vectors for the insertion of foreign DNA into vaccinia virus are well known in the art, and utilize, for example, homologous recombination. A wide variety of viral promoter sequences, possibly terminator sequences and poly(A)-addition sequences, possibly enhancer sequences and possibly amplification sequences, all required for mammalian expression, are available in the art. Vaccinia is particularly preferred since vaccinia halts the expression of host cell proteins. Vaccinia is also very much preferred since it allows the expression of, for instance, E1 and E2 proteins of HCV in cells or individuals which are immunized with the live recombinant vaccinia virus. For vaccination of humans the avipox and Ankara Modified Virus (AMV) are particularly useful vectors.

[0179] Also known are insect expression transfer vectors derived from baculovirus Autographa californica nuclear polyhedrosis virus (AcNPV), which is a helper-independent viral expression vector. Expression vectors derived from this system usually use the strong viral polyhedrin gene promoter to drive the expression of heterologous genes. Different vectors as well as methods for the introduction of heterologous DNA into the desired site of baculovirus are available to the man skilled in the art for baculovirus expression. Also different signals for posttranslational modification recognized by insect cells are known in the art.

[0180] The present invention also relates to a host cell transformed with a recombinant vector as defined above.

[0181] The present invention also relates to a method for detecting antibodies to HCV present in a biological sample, comprising:

[0182] (i) contacting the biological sample to be analysed for the presence of HCV with a polypeptide as defined above,

[0183] (ii) detecting the immunological complex formed between said antibodies and said polypeptide.

[0184] The present invention also relates to a method for HCV typing, comprising:

[0185] (i) contacting the biological sample to be analysed for the presence of HCV with a polypeptide as defined above,

[0186] (ii) detecting the immunological complex formed between said antibodies and said polypeptide.

[0187] The present invention also relates to a diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one polypeptide as defined above, with said polypeptide being preferably bound to a solid support.

[0188] The present invention also relates to a diagnostic kit for HCV typing, said kit comprising at least one polypeptide as defined above, with said polypeptide being preferably bound to a solid support.

[0189] The present invention also relates to a diagnostic kit as defined above, said kit comprising a range of said polypeptides which are attached to specific locations on a solid substrate.

[0190] The present invention also relates to a diagnostic kit as defined above, wherein said solid support is a membrane strip and said polypeptides are coupled to the membrane in the form of parallel lines.

[0191] The immunoassay methods according to the present invention may utilize antigens from the different domains of the new and unique polypeptide sequences of the present invention that maintain linear (in case of peptides) and conformational epitopes (in case of polypeptides) recognized by antibodies in the sera from individuals infected with HCV. It is within the scope of the invention to use for instance single or specific oligomeric antigens, dimeric antigens, as well as combinations of single or specific oligomeric antigens. The HCV antigens of the present invention may be employed in virtually any assay format that employs a known antigen to detect antibodies. Of course, a format that denatures the HCV conformational epitope should be avoided or adapted. A common feature of all of these assays is that the antigen is contacted with the body component suspected of containing HCV antibodies under conditions that permit the antigen to bind to any such antibody present in the component. Such conditions will typically be physiologic temperature, pH and ionic strength using an excess of antigen. The incubation of the antigen with the specimen is followed by detection of immune complexes comprised of the antigen.

[0192] Design of the immunoassays is subject to a great deal of variation, and many formats are known in the art. Protocols may, for example, use solid supports, or immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels may be, for example, enzymatic, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify the signals from the immune complex are also known; examples of which are assays which utilize biotin and avidin or streptavidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays.

[0193] The immunoassay may be, without limitation, in a heterogeneous or in a homogeneous format, and of a standard or competitive type. In a heterogeneous format, the polypeptide is typically bound to a solid matrix or support to facilitate separation of the sample from the polypeptide after incubation. Examples of solid supports that can be used are nitrocellulose (e.g., in membrane or microtiter well form), polyvinyl chloride (e.g., in sheets or microtiter wells), polystyrene latex (e.g., in beads or microtiter plates), polyvinylidene fluoride (known as Immulon™), diazotized paper, nylon membranes, activated beads, and Protein A beads. For example, Dynatech Immulon™ 1 or Immulon™ 2 microtiter plates or 0.25 inch polystyrene beads (Precision Plastic Ball) can be used in the heterogeneous format. The solid support containing the antigenic polypeptides is typically washed after separating it from the test sample, and prior to detection of bound antibodies. Both standard and competitive formats are known in the art.

[0194] In a homogeneous format, the test sample is incubated with the combination of antigens in solution. For example, it may be under conditions that will precipitate any antigen-antibody complexes which are formed. Both standard and competitive formats for these assays are known in the art.

[0195] In a standard format, the amount of HCV antibodies in the antibody-antigen complexes is directly monitored. This may be accomplished by determining whether labeled anti-xenogeneic (e.g. anti-human) antibodies which recognize an epitope on anti-HCV antibodies will bind due to complex formation. In a competitive format, the amount of HCV antibodies in the sample is deduced by monitoring the competitive effect on the binding of a known amount of labeled antibody (or other competing ligand) in the complex.

[0196] Complexes formed comprising anti-HCV antibody (or in the case of competitive assays, the amount of competing antibody) are detected by any of a number of known techniques, depending on the format. For example, unlabeled HCV antibodies in the complex may be detected using a conjugate of anti-xenogeneic Ig complexed with a label (e.g. an enzyme label).

[0197] In an immunoprecipitation or agglutination assay format the reaction between the HCV antigens and the antibody forms a network that precipitates from the solution or suspension and forms a visible layer or film of precipitate. If no anti-HCV antibody is present in the test specimen, no visible precipitate is formed.

[0198] There currently exist three specific types of particle agglutination (PA) assays. These assays are used for the detection of antibodies to various antigens when coated to a support. One type of this assay is the hemagglutination assay using red blood cells (RBCs) that are sensitized by passively adsorbing antigen (or antibody) to the RBC. The addition of specific antigen antibodies present in the body component, if any, causes the RBCs coated with the purified antigen to agglutinate.

[0199] To eliminate potential non-specific reactions in the hemagglutination assay, two artificial carriers may be used instead of RBC in the PA. The most common of these are latex particles. However, gelatin particles may also be used. The assays utilizing either of these carriers are based on passive agglutination of the particles coated with purified antigens.

[0200] The HCV antigens of the present invention comprised of conformational epitopes will typically be packaged in the form of a kit for use in these immunoassays. The kit will normally contain in separate containers the native HCV antigen, control antibody formulations (positive and/or negative), labeled antibody when the assay format requires the same and signal generating reagents (e.g. enzyme substrate) if the label does not generate a signal directly. The native HCV antigen may be already bound to a solid matrix or separate with reagents for binding it to the matrix. Instructions (e.g. written, tape, CD-ROM, etc.) for carrying out the assay usually will be included in the kit.

[0201] Immunoassays that utilize the native HCV antigen are useful in screening blood for the preparation of a supply from which potentially infective HCV is lacking. The method for the preparation of the blood supply comprises the following steps. Reacting a body component, preferably blood or a blood component, from the individual donating blood with HCV polypeptides of the present invention to allow an immunological reaction between HCV antibodies, if any, and the HCV antigen. Detecting whether anti-HCV antibody—HCV antigen complexes are formed as a result of the reacting. Blood contributed to the blood supply is from donors that do not exhibit antibodies to the native HCV antigens.

[0202] In cases of a positive reactivity to the HCV antigen, it is preferable to repeat the immunoassay to lessen the possibility of false positives. For example, in the large scale screening of blood for the production of blood products (e.g. blood transfusion, plasma, Factor VII, immunoglobulin, etc.) ‘screening’ tests are typically formatted to increase sensitivity (to insure no contaminated blood passes) at the expense of specificity; i.e. the false-positive rate is increased. Thus, it is typical to only defer for further testing those donors who are ‘repeatedly reactive’; i.e. positive in two or more runs of the immunoassay on the donated sample. However, for confirmation of HCV-positivity, the ‘confirmation’ tests are typically formatted to increase specificity (to insure that no false-positive samples are confirmed) at the expense of sensitivity.
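The 'repeatedly reactive' rule described in paragraph [0202] amounts to a simple decision on the results of successive immunoassay runs on the same donated sample. The sketch below is only an illustration of that rule: the function names and the status labels are hypothetical, each run is assumed to be reduced to a boolean (reactive or not), and the threshold of two positive runs follows the "two or more runs" wording above.

```python
from typing import Sequence


def is_repeatedly_reactive(run_results: Sequence[bool],
                           min_positive_runs: int = 2) -> bool:
    """True if the sample was positive in at least min_positive_runs runs."""
    return sum(1 for reactive in run_results if reactive) >= min_positive_runs


def screening_status(run_results: Sequence[bool]) -> str:
    """Classify a donated sample from its immunoassay runs (labels are illustrative)."""
    if not any(run_results):
        return "non-reactive"
    if is_repeatedly_reactive(run_results):
        # Only these samples are deferred for further (confirmatory) testing.
        return "repeatedly reactive"
    # A single positive run: the immunoassay is repeated before any decision.
    return "initially reactive"


print(screening_status([False]))
print(screening_status([True]))
print(screening_status([True, False, True]))
```

Repeating a sensitive screening assay before deferral trades a little extra testing for fewer false positives, while the separate confirmation test favours specificity.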

[0203] The solid phase selected can include polymeric or glass beads, nitrocellulose, microparticles, microwells of a reaction tray, test tubes and magnetic beads. The signal generating compound can include an enzyme, a luminescent compound, a chromogen, a radioactive element and a chemiluminescent compound. Examples of enzymes include alkaline phosphatase, horseradish peroxidase and beta-galactosidase. Examples of enhancer compounds include biotin, anti-biotin and avidin. Examples of enhancer compounds binding members include biotin, anti-biotin and avidin. In order to block the effects of rheumatoid factor-like substances, the test sample is subjected to conditions sufficient to block the effect of rheumatoid factor-like substances. These conditions comprise contacting the test sample with a quantity of anti-human IgG to form a mixture, and incubating the mixture for a time and under conditions sufficient to form a reaction mixture product substantially free of rheumatoid factor-like substance.

[0204] The present invention particularly relates to an immunoassay format in which the polypeptides (or peptides) of the invention are coupled to a membrane in the form of parallel lines. This assay format is particularly advantageous for HCV typing purposes.

[0205] The present invention also relates to a pharmaceutical composition comprising at least one (recombinant) polypeptide as defined above and a suitable excipient, diluent or carrier.

[0206] The present invention also relates to a method of preventing HCV infection, comprising administering the pharmaceutical composition as defined above to a mammal in an amount effective to stimulate the production of a protective antibody or protective T-cell response.

[0207] The present invention relates to the use of a composition as defined above in a method for preventing HCV infection.

[0208] The present invention further relates to a vaccine for immunizing a mammal against HCV infection, comprising at least one (recombinant) polypeptide as defined above, in a pharmaceutically acceptable carrier.

[0209] The term ‘immunogenic’ refers to the ability of a substance to cause a humoral and/or cellular response, whether alone or when linked to a carrier, in the presence or absence of an adjuvant. ‘Neutralization’ refers to an immune response that blocks the infectivity, either partially or fully, of an infectious agent. A ‘vaccine’ is an immunogenic composition capable of eliciting protection against HCV, whether partial or complete. A vaccine may also be useful for treatment of an individual, in which case it is called a therapeutic vaccine.

[0210] The term ‘therapeutic’ refers to a composition capable of treating HCV infection.

[0211] The term ‘effective amount’ refers to an amount of epitope-bearing polypeptide sufficient to induce an immunogenic response in the individual to which it is administered, or to otherwise detectably immunoreact in its intended system (e.g., immunoassay). Preferably, the effective amount is sufficient to effect treatment, as defined above. The exact amount necessary will vary according to the application. For vaccine applications or for the generation of polyclonal antiserum/antibodies, for example, the effective amount may vary depending on the species, age, and general condition of the individual, the severity of the condition being treated, the particular polypeptide selected and its mode of administration, etc. It is also believed that effective amounts will be found within a relatively large, non-critical range. An appropriate effective amount can be readily determined using only routine experimentation. Preferred ranges of proteins for prophylaxis of HCV disease are 0.01 to 100 μg/dose, preferably 0.1 to 50 μg/dose. Several doses may be needed per individual in order to achieve a sufficient immune response and subsequent protection against HCV disease.

[0212] The present invention also relates to a vaccine as defined above, comprising at least one (recombinant) polypeptide as defined above, with said polypeptide being unique for at least one of the subtypes or types as defined above.

[0213] Said vaccine compositions may include prophylactic as well as therapeutic vaccine compositions.

[0214] Pharmaceutically acceptable carriers include any carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers; and inactive virus particles. Such carriers are well known to those of ordinary skill in the art.

[0215] Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: aluminium hydroxide (alum), N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP) as found in U.S. Pat. No. 4,606,918, N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE) and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate, and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion. Any of the 3 components MPL, TDM or CWS may also be used alone or combined 2 by 2. Additionally, adjuvants such as Stimulon (Cambridge Bioscience, Worcester, Mass.) may be used.

[0216] Immunogenic compositions used as vaccines comprise a ‘sufficient amount’ or ‘an immunologically effective amount’ of the proteins of the present invention, as well as any other of the above mentioned components, as needed. ‘Immunologically effective amount’, means that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment, as defined above. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (e.g. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, the strain of infecting HCV, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials. Usually, the amount will vary from 0.01 to 1000 μg/dose, more particularly from 0.1 to 100 μg/dose.

[0217] The proteins of the invention may also serve as vaccine carriers to present homologous (e.g. T cell epitopes or B cell epitopes from, for instance, the core, E1, E2, NS2, NS3, NS4 or NS5 regions) or heterologous (non-HCV) haptens, in the same manner as Hepatitis B surface antigen (see European Patent Application 174,444). In this use, envelope proteins provide an immunogenic carrier capable of stimulating an immune response to haptens or antigens conjugated to the aggregate. The antigen may be conjugated either by conventional chemical methods, or may be cloned into the gene encoding E1 and/or E2 at a location corresponding to a hydrophilic region of the protein. Such hydrophilic regions include the V1 region (encompassing amino acid positions 191 to 202), the V2 region (encompassing amino acid positions 213 to 223), the V3 region (encompassing amino acid positions 230 to 242), the V4 region (encompassing amino acid positions 248 to 257), the V5 region (encompassing amino acid positions 294 to 303) and the V6 region (encompassing amino acid positions 329 to 336). Another useful location for insertion of haptens is the hydrophobic region (encompassing approximately amino acid positions 264 to 293). It is shown in the present invention that this region can be deleted without affecting the reactivity of the deleted E1 protein with antisera. Therefore, haptens may be inserted at the site of the deletion.

[0218] The immunogenic compositions are conventionally administered parenterally, typically by injection, for example, subcutaneously or intramuscularly. Additional formulations suitable for other methods of administration include oral formulations and suppositories. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.

[0219] The administration of the immunogen(s) of the present invention may be for either a prophylactic or therapeutic purpose. When provided prophylactically, the immunogen(s) is provided in advance of any exposure to HCV or in advance of any symptoms due to HCV infection. The prophylactic administration of the immunogen serves to prevent or attenuate any subsequent infection of HCV in a mammal. When provided therapeutically, the immunogen(s) is provided at (or shortly after) the onset of the infection or at the onset of any symptom of infection or disease caused by HCV. The therapeutic administration of the immunogen(s) serves to attenuate the infection or disease.

[0220] In addition to use as a vaccine, the compositions can be used to prepare antibodies to HCV (E1) proteins. The antibodies can be used directly as antiviral agents. To prepare antibodies, a host animal is immunized using the E1 proteins native to the virus particle bound to a carrier as described above for vaccines. The host serum or plasma is collected following an appropriate time interval to provide a composition comprising antibodies reactive with the (E1) protein of the virus particle. The gamma globulin fraction or the IgG antibodies can be obtained, for example, by use of saturated ammonium sulfate or DEAE Sephadex, or other techniques known to those skilled in the art. The antibodies are substantially free of many of the adverse side effects which may be associated with other anti-viral agents such as drugs.

[0221] The present invention also relates particularly to a peptide corresponding to an amino acid sequence encoded by at least one of the HCV genomic sequences as defined above, with said peptide being unique to any of the HCV subtypes or types as defined in Table 5, and which contains at least one amino acid differing from any of the known HCV types or subtypes, or an analog thereof being substantially homologous and biologically equivalent.

[0222] The present invention relates particularly to a peptide comprising at least one unique epitope of the new sequences of the invention as represented in SEQ ID NO 1 to 106.

[0223] The present invention relates also particularly to a peptide comprising in its sequence a unique amino acid residue of the invention as defined above.

[0224] The present invention relates particularly to a peptide which is biotinylated as explained in WO 93/18054.

[0225] All the embodiments (immunoassay formats, vaccines, compositions, uses, etc.) illustrated for the polypeptides of the invention as above also relate to the peptides of the invention.

[0226] The present invention also relates to a method for detecting antibodies to HCV present in a biological sample, comprising:

[0227] (i) contacting the biological sample to be analysed for the presence of HCV with a peptide as defined above,

[0228] (ii) detecting the immunological complex formed between said antibodies and said peptide.

[0229] The present invention also relates to a method for HCV typing, comprising:

[0230] (i) contacting the biological sample to be analysed for the presence of HCV with a peptide as defined above,

[0231] (ii) detecting the immunological complex formed between said antibodies and said peptide.

[0232] The present invention also relates to a diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one peptide as defined above, with said peptide being preferably bound to a solid support.

[0233] The present invention also relates to a diagnostic kit for HCV typing, said kit comprising at least one peptide as defined above, with said peptide being preferably bound to a solid support.

[0234] The present invention also relates to a diagnostic kit as defined above, wherein said peptides are selected from the following:

[0235] at least one NS4 peptide,

[0236] at least one NS4 peptide and at least one Core peptide,

[0237] at least one NS4 peptide and at least one Core peptide and at least one E1 peptide,

[0238] at least one NS4 peptide and at least one E1 peptide.

[0239] The present invention also relates to a diagnostic kit as defined above, said kit comprising a range of said peptides which are attached to specific locations on a solid substrate.

[0240] The present invention also relates to a diagnostic kit as defined above, wherein said solid support is a membrane strip and said peptides are coupled to the membrane in the form of parallel lines.

[0241] The present invention also relates to a pharmaceutical composition comprising at least one peptide as defined above and a suitable excipient, diluent or carrier. The present invention also relates to a method of preventing HCV infection, comprising administering the pharmaceutical composition as defined above to a mammal in an amount effective to stimulate the production of protective antibody or a protective T-cell response.

[0242] The present invention also relates to the use of a composition as defined above in a method for preventing HCV infection.

[0243] The present invention also relates to a vaccine for immunizing a mammal against HCV infection, comprising at least one peptide as defined above, in a pharmaceutically acceptable carrier.

[0244] The present invention relates also to a vaccine as defined above, comprising at least one peptide as defined above, with said peptide being unique for at least one of the subtypes or types as defined in Table 5.

[0245] The present invention relates to an antibody raised upon immunization with at least one polypeptide or peptide as defined above, with said antibody being specifically reactive with any of said polypeptides or peptides, and with said antibody being preferably a monoclonal antibody.

[0246] The monoclonal antibodies of the invention can be produced by any hybridoma liable to be formed according to classical methods from splenic cells of an animal, particularly from a mouse or rat, immunized against the HCV polypeptides according to the invention as defined above on the one hand, and of cells of a myeloma cell line on the other hand, and to be selected by the ability of the hybridoma to produce the monoclonal antibodies recognizing the polypeptides which have been initially used for the immunization of the animals.

[0247] The antibodies involved in the invention can be labelled by an appropriate label of the enzymatic, fluorescent, or radioactive type.

[0248] The monoclonal antibodies according to this preferred embodiment of the invention may be humanized versions of mouse monoclonal antibodies made by means of recombinant DNA technology, departing from parts of mouse and/or human genomic DNA sequences coding for H and L chains or from cDNA clones coding for H and L chains.

[0249] Alternatively, the monoclonal antibodies according to this preferred embodiment of the invention may be human monoclonal antibodies. These antibodies according to the present embodiment of the invention can also be derived from human peripheral blood lymphocytes of patients infected with HCV type 1 subtype 1d, 1e, 1f or 1g; HCV type 2 subtype 2e, 2f, 2g, 2h, 2i, 2k or 2l; HCV type 3 subtype 3g; HCV type 4 subtype 4k, 4l or 4m; and/or HCV type 7 (subtypes 7a, 7c or 7d), 9, 10 or 11, or vaccinated against HCV. Such human monoclonal antibodies are prepared, for instance, by means of human peripheral blood lymphocytes (PBL) repopulation of severe combined immune deficiency (SCID) mice (for recent review, see Duchosal et al. 1992) or by screening Epstein-Barr virus-transformed lymphocytes of infected or vaccinated individuals for the presence of reactive B-cells by means of the antigens of the present invention.

[0250] The invention also relates to the use of the proteins of the invention, muteins thereof, or peptides derived therefrom for the selection of recombinant antibodies by the process of repertoire cloning (Persson et al., 1991).

[0251] Antibodies directed to peptides derived from a certain genotype may be used either for the detection of such HCV genotypes, or as therapeutic agents.

[0252] The present invention relates also to a method for detecting HCV antigens present in a biological sample, comprising:

[0253] (i) contacting said biological sample with an antibody as defined above,

[0254] (ii) detecting the immune complexes formed between said HCV antigens and said antibody.

[0255] The present invention relates also to a method for HCV typing, comprising:

[0256] (i) contacting said biological sample with an antibody as defined above,

[0257] (ii) detecting the immune complexes formed between said HCV antigens and said antibody.

[0258] The present invention relates also to a diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one antibody as defined above, with said antibody being preferably bound to a solid support.

[0259] The present invention relates also to a diagnostic kit for HCV typing, said kit comprising at least one antibody as defined above, with said antibody being preferably bound to a solid support.

[0260] The present invention relates also to a diagnostic kit as defined above, said kit comprising a range of said antibodies which are attached to specific locations on a solid substrate.

[0261] The present invention relates also to a pharmaceutical composition comprising at least one antibody as defined above and a suitable excipient, diluent or carrier.

[0262] The present invention relates also to a method of preventing or treating HCV infection, comprising administering the pharmaceutical composition as defined above to a mammal in effective amount.

[0263] The present invention relates also to the use of a composition as defined above in a method for preventing or treating HCV infection.

[0264] The genotype may also be detected by means of a type-specific antibody as defined above, which may also be linked to any polynucleotide sequence that can afterwards be amplified by PCR to detect the immune complex formed (Immuno-PCR, Sano et al., 1992).

[0265] Any publications or patent applications referred to herein are incorporated by reference. The following examples illustrate aspects of the invention but are in no way intended to limit the scope thereof.

FIGURE LEGENDS

[0267] FIG. 1

[0268] Alignment of the nucleotide sequences of the Core/E1 region of some of the isolates of the newly identified types and subtypes of the present invention, with other known prototype isolates of subtypes.

[0269] FIG. 2

[0270] Alignment of the amino acid sequences of the Core/E1 region of some of the isolates of the newly identified types and subtypes of the present invention, with other known prototype isolates of subtypes.

[0271] FIG. 3

[0272] Nucleotide and amino acid sequences obtained from the new HCV isolates of the present invention (SEQ ID NO 1 to 106).

[0273] FIG. 4

[0274] Alignment of the amino acid sequences of the Core/E1 region of some of the isolates of the newly identified types and subtypes of the present invention, with other known prototype isolates of subtypes.

[0275] FIG. 5

[0276] Alignment of the nucleotide sequences of the NS5b region of some of the isolates of the newly identified types and subtypes of the present invention, with other known prototype isolates of subtypes.

[0277] FIG. 6

[0278] Alignment of the amino acid sequences of the NS5b region of some of the isolates of the newly identified types and subtypes of the present invention, with other known prototype isolates of subtypes.

[0279] Table 5

[0280] Overview of the new subtypes and types of the present invention and the regions sequenced. The subtypes between brackets have been replaced by the non-bracketed subtypes following the classification of Tokita et al. (1994).

EXAMPLES

[0281] Serum Samples.

[0282] Serum samples from Cameroonian blood donors (CAM) were screened for HCV antibodies with Innotest HCV Ab III, and confirmed by INNO-LIA HCV III (Innogenetics, Antwerp, Belgium). Serum samples from patients with chronic hepatitis C infection were obtained from various centers in the Benelux countries (BNL), from France (FR), from Pakistan (PAK), from Egypt (EG), and from Vietnam (VN).

[0283] Samples from the Benelux, Cameroon, France and Vietnam were selected because of their aberrant reactivities (isolates CAM1078, FR2, FR1, VN4, VN12, VN13, NE98 and others (see Table 5)).

[0284] cPCR, LiPA, Cloning and Sequencing.

[0285] RNA isolation, cDNA synthesis, PCR, cloning, and LiPA genotyping using biotinylated 5′ UR amplification products were performed as described (Stuyver et al., 1994c). The 5′ UR, the Core/E1, and the NS5B PCR products were used for direct sequencing. The sequences of the universal 5′ UR primers HCPr95, HCPr96, HCPr98, and HCPr29 were described previously (Stuyver et al. 1993b). The following primers were also described (Stuyver et al. 1994c): HCPr41, a sense primer for the amplification of the Core region; HCPr52 and HCPr54 for amplification of the Core/E1 region; and HCPr206 and HCPr207 for amplification of a 340-bp NS5B region.
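As a small worked check, the 340-bp size stated for the NS5B amplicon can be compared with the NS5B nucleotide coordinates listed in Table 5 (7932-8271 for most isolates, 7932-8266 for FR14). The sketch below assumes that the listed coordinates are inclusive at both ends; the helper name `span_length` is illustrative and not part of the original disclosure.

```python
def span_length(start: int, end: int) -> int:
    """Length of a coordinate range that includes both end points."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# NS5B coordinates as listed in Table 5 (assumed inclusive).
ns5b_most_isolates = span_length(7932, 8271)
ns5b_fr14 = span_length(7932, 8266)

print(ns5b_most_isolates)  # 340, matching the stated 340-bp region
print(ns5b_fr14)           # 335, the shorter FR14 sequence
```

Under that assumption the Table 5 range 7932-8271 agrees with the stated amplicon size, and the FR14 entry is five nucleotides shorter.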

[0286] Serum samples BNL1, BNL2, BNL3, BNL4, BNL5, BNL6, BNL7, BNL8, BNL9, BNL10, BNL11, BNL12, CAM1078, FR2, FR16, FR4, FR13, VN13, VN4, VN12, FR1, NE98, and FR19 were analyzed in the Core/E1 region by direct sequencing. Serum samples BNL1, BNL2, FR17, CAM1078, FR2, FR16, BNL3, FR4, BNL5, FR13, FR18, PAK64, BNL8, BNL12, EG81, VN13, VN4, VN12, FR1, NE98, FR14, FR15, and FR19 were also analyzed in the NS5B region by direct sequencing. Partial 5′ UR, Core, E1, and NS5B sequences were obtained. The length of the obtained sequences is sufficient to classify them into new types or subtypes, based on the phylogenetic distances to known sequences. The following sequences could be obtained (nucleotide sequences have odd-numbered SEQ ID NO., amino acid sequences have even-numbered SEQ ID NO.): SEQ ID NO 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103 and 105. The amino acid sequences deduced therefrom are given in SEQ ID NO 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104 and 106. Table 5 gives an overview of these sequences.
TABLE 5

Type      Isolate   Nucleotide sequence position
1d        BNL1      1-310 (SEQ ID NO 1); 478-925 (SEQ ID NO 3); 7932-8271 (SEQ ID NO 53)
1d        BNL2      1-310 (SEQ ID NO 5); 478-925 (SEQ ID NO 7); 7932-8271 (SEQ ID NO 55)
1d        FR17      7932-8271 (SEQ ID NO 57)
1e        CAM1078   1-223 (SEQ ID NO 9); (−238)-414 (SEQ ID NO 59); 7932-8271 (SEQ ID NO 61)
1f        FR2       1-950 (SEQ ID NO 11); 7932-8271 (SEQ ID NO 63)
1g        FR16      (−15)-816 (SEQ ID NO 65); 7932-8271 (SEQ ID NO 67)
2e        BNL3      1-310 (SEQ ID NO 13); 478-957 (SEQ ID NO 15); 7932-8271 (SEQ ID NO 69)
2f        FR4       1-957 (SEQ ID NO 17); 7932-8271 (SEQ ID NO 71)
2g        BNL4      478-925 (SEQ ID NO 19)
2h        BNL5      1-310 (SEQ ID NO 21); 478-925 (SEQ ID NO 23); 7932-8271 (SEQ ID NO 73)
2i        BNL6      478-833 (SEQ ID NO 25)
2k        FR13      (−238)-957 (SEQ ID NO 75); 7932-8271 (SEQ ID NO 77)
2l        FR18      7932-8271 (SEQ ID NO 79)
3g        PAK64     7932-8271 (SEQ ID NO 81)
4k        BNL7      1-310 (SEQ ID NO 27); 478-925 (SEQ ID NO 29)
4k        BNL8      478-925 (SEQ ID NO 31); 7932-8271 (SEQ ID NO 83)
4k        BNL9      478-925 (SEQ ID NO 33)
4k        BNL10     478-925 (SEQ ID NO 35)
4k        BNL11     478-925 (SEQ ID NO 37)
4l        BNL12     478-925 (SEQ ID NO 39); 7932-8271 (SEQ ID NO 85)
4m        EG81      7932-8271 (SEQ ID NO 87)
7a (8b)   VN13      1-413 (SEQ ID NO 45); 7932-8271 (SEQ ID NO 89)
7c (8a)   VN4       1-957 (SEQ ID NO 43); 7932-8271 (SEQ ID NO 91)
7d (9a)   VN12      1-957 (SEQ ID NO 47); 7932-8271 (SEQ ID NO 93)
9a (7a)   FR1       1-957 (SEQ ID NO 41); 7932-8271 (SEQ ID NO 95)
10a       NE98      1-310 (SEQ ID NO 49); 478-925 (SEQ ID NO 51); 7932-8271 (SEQ ID NO 97)
11a       FR14      7932-8266 (SEQ ID NO 99)
11a       FR15      7932-8271 (SEQ ID NO 101)
11a       FR19      (−238)-223 (SEQ ID NO 103); 7932-8271 (SEQ ID NO 105)

Type      Isolate   Amino acid sequence position
1d        BNL1      1-103 (SEQ ID NO 2); 159-308 (SEQ ID NO 4); 2645-2757 (SEQ ID NO 54)
1d        BNL2      1-103 (SEQ ID NO 6); 159-308 (SEQ ID NO 8); 2645-2757 (SEQ ID NO 56)
1d        FR17      2645-2757 (SEQ ID NO 58)
1e        CAM1078   1-74 (SEQ ID NO 10); 1-138 (SEQ ID NO 60); 2645-2757 (SEQ ID NO 62)
1f        FR2       1-316 (SEQ ID NO 12); 2645-2757 (SEQ ID NO 64)
1g        FR16      1-158 (SEQ ID NO 66); 2645-2757 (SEQ ID NO 68)
2e        BNL3      1-103 (SEQ ID NO 14); 159-317 (SEQ ID NO 16); 2645-2757 (SEQ ID NO 70)
2f        FR4       1-317 (SEQ ID NO 18); 2645-2757 (SEQ ID NO 72)
2g        BNL4      159-308 (SEQ ID NO 20)
2h        BNL5      1-103 (SEQ ID NO 22); 159-308 (SEQ ID NO 24); 2645-2757 (SEQ ID NO 74)
2i        BNL6      159-277 (SEQ ID NO 26)
2k        FR13      1-316 (SEQ ID NO 76); 2645-2757 (SEQ ID NO 78)
2l        FR18      2645-2757 (SEQ ID NO 80)
3g        PAK64     2645-2757 (SEQ ID NO 82)
4k        BNL7      1-103 (SEQ ID NO 28); 159-308 (SEQ ID NO 30)
4k        BNL8      159-308 (SEQ ID NO 32); 2645-2757 (SEQ ID NO 84)
4k        BNL9      159-308 (SEQ ID NO 34)
4k        BNL10     159-308 (SEQ ID NO 36)
4k        BNL11     159-308 (SEQ ID NO 38)
4l        BNL12     159-308 (SEQ ID NO 40); 2645-2757 (SEQ ID NO 86)
4m        EG81      2645-2757 (SEQ ID NO 88)
7a (8b)   VN13      1-137 (SEQ ID NO 46); 2645-2757 (SEQ ID NO 90)
7c (8a)   VN4       1-317 (SEQ ID NO 44); 2645-2757 (SEQ ID NO 92)
7d (9a)   VN12      1-317 (SEQ ID NO 48); 2645-2757 (SEQ ID NO 94)
9a (7a)   FR1       1-317 (SEQ ID NO 42); 2645-2757 (SEQ ID NO 96)
10a       NE98      1-103 (SEQ ID NO 50); 159-308 (SEQ ID NO 52); 2645-2757 (SEQ ID NO 98)
11a       FR14      2645-2755 (SEQ ID NO 100)
11a       FR15      2645-2757 (SEQ ID NO 102)
11a       FR19      1-74 (SEQ ID NO 104); 2645-2757 (SEQ ID NO 106)
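The deduction of an even-numbered amino acid sequence from its odd-numbered nucleotide sequence can be sketched as follows. This is a minimal illustration, not the procedure actually used to prepare the sequence listing. It assumes the standard genetic code, a reading frame starting at the first nucleotide, and the convention (inferred from the opening of SEQ ID NO 1 and SEQ ID NO 2 in the sequence listing) that any codon containing an ambiguity code such as K, S, Y, R or N is reported as Xaa. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(nucleotides: str) -> list[str]:
    """Translate complete codons in frame 1; ambiguous codons give Xaa."""
    seq = "".join(nucleotides.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])  # None if the codon is ambiguous
        residues.append(THREE_LETTER[aa] if aa is not None else "Xaa")
    return residues


# First 80 nucleotides of SEQ ID NO 1 as printed in the sequence listing.
seq1_start = (
    "ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA "
    "ACACCAACCG CCGCCCTCAK GGSGTNNNNN NNCCGGGTGG"
)
print(" ".join(translate(seq1_start)))
```

With these assumptions the 26 complete codons give Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro, followed by five Xaa residues and Pro Gly, which agrees with the start of SEQ ID NO 2.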

[0287] Phylogenetic Analysis.

[0288] Previously published sequences were taken from the EMBL/Genbank database. Alignments were created using the program HCVALIGN (Stuyver et al. 1994c). Sequences were presented in a sequential format to the Phylogeny Inference Package (PHYLIP) version 3.5c (public domain program freely available from the University of Washington, Seattle, USA). Distance matrices were produced by DNADIST using the Kimura 2-parameter setting and further analyzed in NEIGHBOR, using the neighbor-joining setting. The program DRAWTREE was used to create graphic outputs.
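The distance calculation named above can be illustrated with the classical closed-form Kimura 2-parameter estimate, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The sketch below is an assumption-laden illustration and not a reimplementation of DNADIST, whose Kimura option may differ in detail (for example in how the transition/transversion ratio is handled). It assumes pre-aligned sequences of equal length and skips any site where either sequence carries a gap or ambiguity code. The function name `k2p_distance` is illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
UNAMBIGUOUS = PURINES | PYRIMIDINES


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Closed-form Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P estimate")
    return -0.5 * math.log(term)


# Toy alignment (hypothetical data): one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining program such as NEIGHBOR takes to build a tree.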

[0289] Identification of New Subtypes

[0290] These analyses indicated the clustering of BNL1, BNL2, CAM1078, FR2, FR16, and FR17 with type 1 isolates, yet none of these sequences clustered together with any of the known type 1 subtypes 1a, 1b, or 1c. BNL1, BNL2, and FR17 clearly clustered together and could be assigned a new type 1 subtype 1d, while CAM1078 could be classified into another new subtype 1e, FR2 could be classified into another type 1 subtype 1f, and FR16 could be classified into yet another type 1 subtype 1g. Interestingly, all 3 type 1d isolates (BNL1, BNL2, and FR17) and 1g isolate FR16 were obtained from patients of Moroccan ethnic origin who resided in Europe.

[0291] Another group of isolates showed homology to other type 2 sequences, but none of the isolates BNL3, FR4, BNL4, BNL5, BNL6, FR13, or FR18 could be classified into one of the known type 2 subtypes 2a, 2b, 2c (Bukh et al., 1993), or 2d (Stuyver et al., 1994c). Based on the phylogenetic distances to other type 2 isolates and to other isolates of the group, each of these isolates could be classified into a new type 2 subtype. BNL3 was assigned subtype 2e, FR4 subtype 2f, BNL4 subtype 2g, BNL5 subtype 2h, and BNL6 could be classified into yet another type 2 subtype 2i. If the previously published isolate HN4 is classified as 2j, FR13 and FR18 may be classified into new type 2 subtypes 2k and 2l. However, the possibility that FR13 and FR18 could belong to subtypes 2g or 2i has not yet been ruled out. Definite classification can be obtained by determining the NS5B sequences of isolates BNL4 and BNL6, belonging to subtypes 2g and 2i, respectively.

[0292] Isolate PAK64 showed homology to type 3 sequences, but could not be classified into one of the known type 3 subtypes 3a to 3f. Based on the phylogenetic distances to other type 3 isolates, PAK64 could be classified into a new type 3 subtype. PAK64 was assigned subtype 3g. However, the possibility that PAK64 belongs to a known type 3 subtype cannot be strictly ruled out since only one region of the genome has been sequenced. Definite classification can be obtained by determining the Core/E1 sequences of isolate PAK64 after amplification with primers HCPr52 and HCPr54.

[0293] Among the Benelux and Egyptian samples that were analyzed, some sequences clustered with the previously identified type 4 subtypes 4c and 4d. However, BNL7, BNL8, BNL9, BNL10, BNL11, BNL12, and EG81 clustered into new subtypes of type 4. Isolates BNL7, BNL8, BNL9, BNL10, and BNL11 clustered again separately from BNL12 and EG81 into a new subtype 4k. This subtype was the predominant subtype in the Benelux countries. BNL12 and EG81 also segregated into separate subtypes. BNL12 was assigned to another new subtype 4l and EG81 was assigned to yet another new subtype 4m.

[0294] Identification of New HCV Major Types

[0295] Isolates FR1, VN4, VN12, VN13, NE98, FR14, FR15, and FR19 did not cluster with any of the known 6 major types of HCV. VN4, VN12, and VN13 were very distantly related to genotype 6, but phylogenetic analysis indicated that these isolates should be assigned new major types. VN13, VN4 and VN12 were related at the subtype level and assigned type 7a, 7c, and 7d, respectively. FR1 was not related to any known isolate and was assigned genotype 9a. NE98 shows a distant relatedness to type 3 sequences, yet phylogenetic analysis suggested classification into a new major type 10a. Depending on international guidelines for assigning type and subtype levels, NE98 may also be classified into an additional type 3 subtype. FR14, FR15, and FR19 show a very distant relatedness to type 2 sequences, yet phylogenetic analysis indicated that these isolates should be classified into a new major type 11, all belonging to the same subtype designated 11a. Depending on international guidelines for assigning type and subtype levels, FR14, FR15, and FR19 may also be classified into an additional type 2 subtype.

REFERENCES

[0296] Barany F (1991). Genetic disease detection and DNA amplification using cloned thermostable ligase. Proc Natl Acad Sci USA 88:189-193.

[0297] Bej A, Mahbubani M, Miller R, Di Cesare J, Haff L, Atlas R (1990) Multiplex PCR amplification and immobilized capture probes for detection of bacterial pathogens and indicators in water. Mol Cell Probes 4:353-365.

[0298] Bukh J, Purcell R, Miller R (1992). Sequence analysis of the 5′ noncoding region of hepatitis C virus. Proc Natl Acad Sci USA 89:4942-4946.

[0299] Bukh J, Purcell R, Miller R (1993). At least 12 genotypes of hepatitis C virus predicted by sequence analysis of the putative E1 gene of isolates collected worldwide. Proc Natl Acad Sci USA 90:8234-8238.

[0300] Cha T, Beal E, Irvine B, Kolberg J, Chien D, Kuo G, Urdea M (1992) At least five related, but distinct, hepatitis C viral genotypes exist. Proc Natl Acad Sci USA 89:7144-7148.

[0301] Chan S-W, Simmonds P, McOmish F, Yap P, Mitchell R, Dow B, Follett E (1991) Serological responses to infection with three different types of hepatitis C virus. Lancet 338:1991.

[0302] Chan S-W, McOmish F, Holmes E, Dow B, Peutherer J, Follett E, Yap P, Simmonds P (1992) Analysis of a new hepatitis C virus type and its phylogenetic relationship to existing variants. J Gen Virol 73:1131-1141.

[0303] Chomczynski P, Sacchi N (1987) Single step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162:156-159.

[0304] Choo Q, Richman K, Han J, Berger K, Lee C, Dong C, Gallegos C, Coit D, Medina-Selby A, Barr P, Weiner A, Bradley D, Kuo G, Houghton M (1991) Genetic organization and diversity of the hepatitis C virus. Proc Natl Acad Sci USA 88:2451-2455.

[0305] Compton J (1991). Nucleic acid sequence-based amplification. Nature, 350: 91-92.

[0306] Duchosal A, Eming S, Fisher P (1992) Immunization of hu-PBL-SCID mice and the rescue of human monoclonal Fab fragments through combinatorial libraries. Nature 355:258-262.

[0307] Duck P (1990). Probe amplifier system based on chimeric cycling oligonucleotides. Biotechniques 9, 142-147.

[0308] Guatelli J, Whitfield K, Kwoh D, Barringer K, Richman D, Gingeras T (1990) Isothermal, in vitro amplification of nucleic acids by a multienzyme reaction modeled after retroviral replication. Proc Natl Acad Sci USA 87:1874-1878.

[0309] Hijikata M, Kato N, Ootsuyama Y, Nakagawa M, Shimotohno K (1991) Gene mapping of the putative structural region of the hepatitis C virus genome by in vitro processing analysis. Proc Natl Acad Sci USA 88:5547-5551.

[0310] Jacobs K, Rudersdorf R, Neill S, Dougherty J, Brown E, Fritsch E (1988) The thermal stability of oligonucleotide duplexes is sequence independent in tetraalkylammonium salt solutions: application to identifying recombinant DNA clones. Nucl Acids Res 16:4637-4650.

[0311] Kato N, Hijikata M, Ootsuyama Y, Nakagawa M, Ohkoshi S, Sugimura T, Shimotohno K (1990) Molecular cloning of the human hepatitis C virus genome from Japanese patients with non-A, non-B hepatitis. Proc Natl Acad Sci USA 87:9524-9528.

[0312] Kwoh D, Davis G, Whitfield K, Chappelle H, Dimichele L, Gingeras T (1989). Transcription-based amplification system and detection of amplified human immunodeficiency virus type 1 with a bead-based sandwich hybridization format. Proc Natl Acad Sci USA, 86: 1173-1177.

[0313] Kwok S, Kellogg D, McKinney N, Spasic D, Goda L, Levenson C, Sninsky J (1990). Effects of primer-template mismatches on the polymerase chain reaction: Human immunodeficiency virus type 1 model studies. Nucl Acids Res 18:999.

[0314] Landegren U, Kaiser R, Sanders J, Hood L (1988). A ligase-mediated gene detection technique. Science 241:1077-1080.

[0315] Lizardi P, Guerra C, Lomeli H, Tussie-Luna I, Kramer F (1988) Exponential amplification of recombinant RNA hybridization probes. Bio/Technology 6:1197-1202.

[0316] Lomeli H, Tyagi S, Pritchard C, Lizardi P, Kramer F (1989) Quantitative assays based on the use of replicatable hybridization probes. Clin Chem 35:1826-1831.

[0317] Machida A, Ohnuma H, Tsuda F, Munekata E, Tanaka T, Akahane Y, Okamoto H, Mishiro S (1992) Hepatology 16, 886-891.

[0318] Maniatis T, Fritsch E, Sambrook J (1982) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

[0319] Mori S, Kato N, Yagyu A, Tanaka T, Ikeda Y, Petchclai B, Chiewsilp P, Kurimura T, Shimotohno K (1992) A new type of hepatitis C virus in patients in Thailand. Biochem Biophys Res Comm 183:334-342.

[0320] Okamoto H, Okada S, Sugiyama Y, Kurai K, Iizuka H, Machida A, Miyakawa Y, Mayumi M (1991) Nucleotide sequence of the genomic RNA of hepatitis C virus isolated from a human carrier: comparison with reported isolates for conserved and divergent regions. J Gen Virol 72:2697-2704.

[0321] Okamoto H, Kurai K, Okada S, Yamamoto K, Iizuka H, Tanaka T, Fukuda S, Tsuda F, Mishiro S (1992) Full-length sequences of a hepatitis C virus genome having poor homology to reported isolates: comparative study of four distinct genotypes. Virology 188:331-341.

[0323] Persson M, Caothien R, Burton D (1991). Generation of diverse high-affinity human monoclonal antibodies by repertoire cloning. Proc Natl Acad Sci USA 88:2432-2436.

Saiki R, Gelfand D, Stoffel S, Scharf S, Higuchi R, Horn G, Mullis K, Erlich H (1988). Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.

[0324] Saiki R, Walsh P, Levenson C, Erlich H (1989) Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proc Natl Acad Sci USA 86:6230-6234.

[0325] Sano T, Smith C, Cantor C (1992) Immuno-PCR: very sensitive antigen detection by means of specific antibody-DNA conjugates. Science 258:120-122.

[0326] Simmonds P, McOmish F, Yap P, Chan S, Lin C, Dusheiko G, Saeed A, Holmes E (1993a) Sequence variability in the 5′ non-coding region of hepatitis C virus: identification of a new virus type and restrictions on sequence diversity. J Gen Virol 74:661-668.

[0327] Stuyver L, Rossau R, Wyseur A, Duhamel M, Vanderborght B, Van Heuverswyn H, Maertens G (1993b) Typing of hepatitis C virus (HCV) isolates and characterization of new (sub)types using a Line Probe Assay. J Gen Virology, 74: 1093-1102.

[0328] Tokita et al. (1994) Hepatitis C virus variants from Vietnam are classifiable into the seventh, eighth, and ninth major genetic groups. Proc. Natl. Acad. Sci. USA 91: 11022-11026.

[0329] Walker G, Little M, Nadeau J, Shank D (1992). Isothermal in vitro amplification of DNA by a restriction enzyme/DNA polymerase system. Proc Natl Acad Sci USA 89:392-396.

[0330] Wu D, Wallace B (1989). The ligation amplification reaction (LAR)—amplification of specific DNA sequences using sequential rounds of template-dependent ligation. Genomics 4:560-569.

[0331] Miller P, Yano J, Yano E, Carroll C, Jayaram K, Ts'o P (1979) Nonionic nucleic acid analogues. Synthesis and characterization of dideoxyribonucleoside methylphosphonates. Biochemistry 18(23):5134-43.

[0332] Nielsen P, Egholm M, Berg R, Buchardt O (1991) Sequence-selective recognition of DNA by strand displacement with a thymine-substituted polyamide. Science 254(5037):1497-500.

[0333] Nielsen P, Egholm M, Berg R, Buchardt O (1993) Sequence specific inhibition of DNA restriction enzyme cleavage by PNA. Nucleic Acids Res. 21(2):197-200.

[0334] Asseline U, Delarue M, Lancelot G, Toulme F, Thuong N (1984) Nucleic acid-binding molecules with high affinity and base sequence specificity: intercalating agents covalently linked to oligodeoxynucleotides. Proc. Natl. Acad. Sci. USA 81(11):3297-301.

[0335] Matsukura M, Shinozuka K, Zon G, Mitsuya H, Reitz M, Cohen J, Broder S (1987) Phosphorothioate analogs of oligodeoxynucleotides: inhibitors of replication and cytopathic effects of human immunodeficiency virus. Proc. Natl. Acad. Sci. USA 84(21):7706-10.

[0336] Maertens, G., Ducatteeuw, A., Stuyver, L., Vandeponseele, P., Venneman, A., Wyseur, A., Bosman, F., Heijtink, R. & de Martynoff, G. (1994) Low prevalence of anti-E1 antibodies reactive to recombinant type 1b E1 envelope protein in type 2, 3, and 4 sera, but high prevalence in subtypes 1a and 1b. In: Viral Hepatitis and Liver Disease, Proceedings of the International Symposium on Viral Hepatitis and Liver Disease (Eds. Nishioka, K., Suzuki, H., Mishiro, S., and Oda, T.), pp 314-316, Springer-Verlag Tokyo.

[0337] Simmonds, P., Rose, K. A., Graham, S., Chan, S.-W., McOmish, F., Dow, B. C., Follett, E. A. C., Yap, P. L., & Marsden, H. (1993b) Mapping of serotype-specific, immunodominant epitopes in the NS4 region of hepatitis C virus (HCV): Use of type-specific peptides to serologically discriminate infections with HCV type 1, 2, and 3. J. Clin. Microbiol. 31, 1493-1503.

[0338] Simmonds, P., Holmes, E. C., Cha, T.-A., Chan, S.-W., McOmish, F., Irvine, B., Beall, E., Yap, P.L., Kolberg, J., & Urdea, M. S. (1993c) J. Gen. Virol. 74, 2391-2399.

[0339] Stuyver, L., Van Arnhem, W., Wyseur, A. & Maertens, G. (1994) Cloning and phylogenetic analysis of the Core, E2, and NS3/4 regions of hepatitis C virus type 5a. Biochem. Biophys. Res. Comm. 202, 1308-1314.

[0340] Simmonds, P., Alberti, A., Alter, H., Bonino, F., Bradley, D. W., Brechot, C., Brouwer, J., Chan, S.-W., Chayama, K., Chen, D.-S., Choo, Q.-L., Colombo, M., Cuypers, T., Date, T., Dusheiko, G., Esteban, J. I., Fay, O., Hadziyannis, S., Han, J., Hatzakis, A., Holmes, E. C., Hotta, H., Houghton, M., Irvine, B., Kohara, M., Kolberg, J. A., Kuo, G., Lau, J. Y. N., Lelie, P. N., Maertens, G., McOmish, F., Miyamura, T., Mizokami, M., Nomoto, A., Prince, A. M., Reesink, H. W., Rice, C., Roggendorf, M., Schalm, S., Shikata, T., Shimotohno, K., Stuyver, L., Trépo, C., Weiner, A., Yap, P. L. & Urdea, M. S. (1994) A proposed system for the nomenclature of hepatitis C virus genotypes. Hepatology 19, 1321-1324.

[0341] Stuyver, L., Van Arnhem, W., Wyseur, A., DeLeys, R. & Maertens, G. (1993a) Analysis of the putative E1 envelope and NS4a epitope regions of HCV type 3. Biochem. Biophys. Res. Comm. 192, 635-641.

[0342] Stuyver, L., Rossau, R., Wyseur, A., Duhamel, M., Vanderborght, B., Van Heuverswyn, H. & Maertens, G. (1993b) Typing of hepatitis C virus isolates and characterization of new subtypes using a line probe assay. J. Gen Virol. 74, 1093-1102.

[0343] Stuyver, L., Wyseur, A., Van Arnhem, W., Rossau, R., Delaporte, E., Dazza, M.-C., Van Doorn, L.-J., Kleter, B. & Maertens, G. (1994a) The use of a line probe assay as a tool to detect new types or subtypes of hepatitis C virus. In: Viral Hepatitis and Liver Disease, Proceedings of the International Symposium on Viral Hepatitis and Liver Disease (Eds. Nishioka, K., Suzuki, H., Mishiro, S., and Oda, T.), pp 317-319, Springer-Verlag Tokyo.

[0344] Stuyver, L., Van Arnhem, W., Wyseur, A. & Maertens, G. (1994b) Cloning and phylogenetic analysis of the Core, E2, and NS3/4 regions of the hepatitis C virus type 5a. Biochem. Biophys. Res. Comm. 202, 1308-1314.

[0345] Stuyver, L., Van Arnhem, W., Wyseur, A., Hernandez, F., Delaporte, E., & Maertens, G. (1994c) Classification of hepatitis C viruses based on phylogenetic analysis of the E1 and NS5B regions and identification of 5 new subtypes. Proc. Natl. Acad. Sci. USA 91.

[0346] Stuyver et al. (1995) Hepatitis C virus genotyping by means of 5′-UR/core line probe assays and molecular analysis of untypeable samples. Virus Research (in press).

1 207 327 base pairs nucleic acid single linear cDNA NO NO 1 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCTCAK 60 GGSGTNNNNN NNCCGGGTGG CGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120 GGCCCCAGGN NGGGTGTGCG CGCGACTAGG AAGACTTCCG AGCGGTCACA ACCTCGTGGC 180 AGGCGACAGC CTATCCCCAA GGCTCGYCGG YCCGAGGGCA GGTCCTGGGC TCAGCCCGGG 240 TATCCTTGGC CCCTCTATGG CAATGAGGGC TGCGGGTGGG CGGGNTGGCT CCTGTCCCCC 300 CGCGGCTCTC GGCCCAATTG GGGCCCC 327 109 amino acids amino acid linear peptide 2 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Xaa Xaa Xaa Xaa Xaa Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Xaa Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Ala Xaa Arg Xaa Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Xaa Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro 100 105 447 base pairs nucleic acid single linear cDNA NO NO 3 GACGGCGTGA ACTATGCAAC AGGGAACTTG CCCGGTTGCT CTTTCTCTAT CTTCCTCTTG 60 GCTTTGCTGT CCTGCTTGAC GGTTCCAACK ACCGCTCACG AGGTGCGCAA CGCATCCGGG 120 GTGTATCATG TCACCAACGA CTGTTCCAAC TCGAGCATCA TCTATGAGAT GGACGGTATG 180 ATCATGCACT ACCCAGGGTG CGTGCCCTGC GTTCGGGAGG ATAACCATCT CCGCTGCTGG 240 ATGGCGCTCA CCCCCACGCT TGCGGTCAAA AAYGCTAGTG TCCCCACTRC GGCAATCCGA 300 CGTCACGTCG ACTTGCTTGT TGGGGGNNCC ACGTTCTGTT CCGCTATGTA CGTGGGRGAC 360 CTTTGCGGGT CTGTCTTCCT CGCTGGCCAG CTATTCACCT TTTCACCCCG CATGCACCAT 420 ACAACGCAGG AGTGCAACTG CTCAATC 447 149 amino acids amino acid linear peptide 4 Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Xaa Thr Al 20 25 30 His Glu Val Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp Cy 35 40 45 Ser Asn Ser Ser Ile Ile Tyr Glu Met Asp Gly Met Ile Met His Ty 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn His Leu Arg Cys Tr 65 70 75 80 Met Ala Leu Thr Pro Thr Leu 
Ala Val Lys Xaa Ala Ser Val Pro Th 85 90 95 Xaa Ala Ile Arg Arg His Val Asp Leu Leu Val Gly Xaa Xaa Thr Ph 100 105 110 Cys Ser Ala Met Tyr Val Xaa Asp Leu Cys Gly Ser Val Phe Leu Al 115 120 125 Gly Gln Leu Phe Thr Phe Ser Pro Arg Met His His Thr Thr Gln Gl 130 135 140 Cys Asn Cys Ser Ile 145 327 base pairs nucleic acid single linear cDNA NO NO 5 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCACAG 60 GACGTCAAGN TCCCGGGTGG TGGTCAGATC GTTGGTGGAG TTTACCTGTT GCCGCGCAGG 120 GGCCCCAGGT TGGGTGTGCG CGCGACCAGG AAGACTTCCG AGCGGTCGCA GCCTCGTGAC 180 AGGCGACAGC CTATTCCTAA GGCTCGCCAG TCCGATGGCA GNNCCTGGGC TCAGCCAGGG 240 CATCCCTGGC CCCTCTATGG CAATGAGGGC TGCGGATGGG CGGGATGGCT CCTGTCCCCC 300 CGCGGCTCTC GGCCCAGTTG GGGCCCC 327 109 amino acids amino acid linear peptide 6 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Xaa Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Asp Arg Arg Gln Pr 50 55 60 Ile Pro Lys Ala Arg Gln Ser Asp Gly Xaa Xaa Trp Ala Gln Pro Gl 65 70 75 80 His Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro 100 105 447 base pairs nucleic acid single linear cDNA NO NO 7 GACGGCGTGA ACTATGCAAC AGGGAATTTG CCTGGTTGCT CTTTCTCTAT CTTCCTCTTA 60 GCTTTTCTGT CCTGCTTGAC GGTTCCAACT ACCGCTCATG AGGTGCGCAA CGCATCCGGG 120 GTATATCATC TCACCAATGA CTGTTCCAAC TCGAGCATCA TCTATGAGAT GAGTGGTATG 180 ATCTTGCACG CCCCAGGGTG TGTGCCCTGC GTTCGGGAGA ACAACTCTTC TCGTTGCTGG 240 ATGCCRCTCA CCCCCACGCT TGCGGTCAAA GACGCTAATG TCCCTACTGC GGCAATCCGA 300 CGCCATGTCG ACTTGCTGGT TGGGACAGCC GCGTTTCGTT CCGCTATGTA CGTGGGGGAC 360 CTCTGCGGAT CCGTCTTCCT TGTCGGCCAG CTATTCACCT TTTCACCCCG CTTGTACCAT 420 ACAACACAGG AGTGCAACTG CTCAATC 447 149 amino acids amino acid linear peptide 8 Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Phe Leu Ser 
Cys Leu Thr Val Pro Thr Thr Al 20 25 30 His Glu Val Arg Asn Ala Ser Gly Val Tyr His Leu Thr Asn Asp Cy 35 40 45 Ser Asn Ser Ser Ile Ile Tyr Glu Met Ser Gly Met Ile Leu His Al 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser Arg Cys Tr 65 70 75 80 Met Xaa Leu Thr Pro Thr Leu Ala Val Lys Asp Ala Asn Val Pro Th 85 90 95 Ala Ala Ile Arg Arg His Val Asp Leu Leu Val Gly Thr Ala Ala Ph 100 105 110 Arg Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Va 115 120 125 Gly Gln Leu Phe Thr Phe Ser Pro Arg Leu Tyr His Thr Thr Gln Gl 130 135 140 Cys Asn Cys Ser Ile 145 223 base pairs nucleic acid single linear cDNA NO NO 9 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA ACACCAACCG CCGCCCACAG 60 GACGTCAAGT TCCCGGGCGG TGGCCAGATC GTTGGTGGAG TCTACGTGCT ACCGCGCAGG 120 GGCCCTAGAT TGGGTGTGCG CGCAGCGCGG AAGACTTCGG AGCGGTCGCA ACCTCGTGGG 180 AGGCGCCAAC CTATTCCCAA GGAGCGCCGA CCCGAGGGCA GGT 223 74 amino acids amino acid linear peptide 10 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Val Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Ala Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Glu Arg Arg Pro Glu Gly Arg 65 70 957 base pairs nucleic acid single linear cDNA NO NO 11 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGCA ACACCAACCG CCGCCCACAG 60 GACGTTAAAT TCCCGGGTGG GGGGCAGATC GTGGGTGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCCAGGT TGGGTGTGCG CGCGACGAGG AAGACTTCCG AGCGGTCGCA ACCTCGCGGA 180 AGGCGACAGC CTATCCCCAA GGCTCGCCGA CCCGAGGGCA GGTCCTGGGC TCAGCCTGGG 240 TACCCATGGC CCCTCTATGC TAACGAGGGC TGCGGATGGG CGGGATGGCT CCTGTCCCCT 300 CGCGGCTCCC GTCCTAGCTG GGGCCCCAAT GACCCCCGAC GTAGATCACG CAATTTGGGT 360 AAGGTCATCG ATACCCTAAC GTGTGGCTTC GCCGATCTCA TGGGGTACAT TCCGCTCGTC 420 GGCGCCCCCC TAGGGGGCGC TTCCAGAACC CTGNCACATG GTGTCCGGGT CCTGGNAGGC 480 GGCGTGATNN NNNNNNNNNN NAACCTTCCN GGTTGCTCTT TNNCTATCTT CCTCTTGGCN 540 TTACTCTCTT GCCTCACAGT CCCCACCTCT 
GCCTATGAGG TGCACAGCAC AACCGATGGC 600 TACCATGTCA CTAATGACTG TTCCAACGGC AGCATCGTAT ATGAGGCAAA GGACATCATC 660 CTTCACACGC CTGGGTGNGT GCCCTGCATA CGGGAAGGCA ATATCTCCCG TTGCTGGGTA 720 CCGCTCACCC CCACGCTCGC AGCGCGGATC GCGAACGCTC CCATCGATGA GGTGCGGCGT 780 CACGTCGACC TCCTCGTGGG GGCAGCCGTG TTCTGCTCAG CCATGTACAT TGGGGACCTT 840 TGTGGGGGCG TCTTCCTCGT TGGGCAATTG TTCACCTTCA CGTCCCGGCG GCATTGGACG 900 GTGCAGGACT GTAATTGTTC CATTTACTCT GGCCACATAA CGGGCCACCG NNNNNNN 957 319 amino acids amino acid linear peptide 12 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Ala Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pr 100 105 110 Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Le 130 135 140 Gly Gly Ala Ser Arg Thr Leu Xaa His Gly Val Arg Val Leu Xaa Gl 145 150 155 160 Gly Val Xaa Xaa Xaa Xaa Xaa Asn Leu Xaa Gly Cys Ser Xaa Xaa Il 165 170 175 Phe Leu Leu Xaa Leu Leu Ser Cys Leu Thr Val Pro Thr Ser Ala Ty 180 185 190 Glu Val His Ser Thr Thr Asp Gly Tyr His Val Thr Asn Asp Cys Se 195 200 205 Asn Gly Ser Ile Val Tyr Glu Ala Lys Asp Ile Ile Leu His Thr Pr 210 215 220 Gly Xaa Val Pro Cys Ile Arg Glu Gly Asn Ile Ser Arg Cys Trp Va 225 230 235 240 Pro Leu Thr Pro Thr Leu Ala Ala Arg Ile Ala Asn Ala Pro Ile As 245 250 255 Glu Val Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Val Phe Cy 260 265 270 Ser Ala Met Tyr Ile Gly Asp Leu Cys Gly Gly Val Phe Leu Val Gl 275 280 285 Gln Leu Phe Thr Phe Thr Ser Arg Arg His Trp Thr Val Gln Asp Cy 290 295 300 Asn Cys Ser Ile Tyr Ser Gly His Ile Thr Gly His Xaa Xaa Xaa 305 310 315 310 base 
pairs nucleic acid single linear cDNA NO NO 13 ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA ATACCAACCG CCGCCCACAG 60 GACGTCAAGT TCCCGGGCGG CGGCCAGATC GTTGGCGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCCAGAT TGGGTGTGCG CGCGACGAGA AAGACTTCTG AACGGTCCCA GCCACGTGGA 180 AGGCGCCAGC CCATCCCTAA AGATCGGNGN GCCACTGGCA GGTCCTGGGG ACGTCCAGGA 240 TATCCCTGGC CCCTGTATGG GAACGAGGGG CTCGGCTGGG CAGGATGGCT CCTGTCCCCC 300 CGAGGCTCTC 310 108 amino acids amino acid linear peptide 14 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Asp Arg Xaa Ala Thr Gly Arg Ser Trp Gly Arg Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly 100 105 579 base pairs nucleic acid single linear cDNA NO NO 15 ACGTGCGGNT NTGCCGACCT CATGGGGTAC ATNCCCGTTG TCGGCGCCCC GGTGGGCGGG 60 GTNGCCAGGG CCCTCGCGNA TGGCGTGCGG GTCCTGGAGG ACGGGATAAA TTATGNAACA 120 GGGAACCTCC CTGGTTGCTC CTTTTCTATC TTCTNGTTGG CTCTTCTGTC TTGTGTCACC 180 GTGCCTGTCT CTGNCGTTGA GGTCAAAAAT ACCAGTCAGG CCTATATGGC AACCAACGAC 240 TGCTCCAACA ACAGCATCGT ATGGCAATTG GNGGACGCGG TGCTTCATGT TCCTGGATGT 300 GTCCCCTGCG AGAATAGCTC CGGTCGGTTC CACTGTTGGA TCCCGATCTC GCCCAACATA 360 GCCGTGAGCA AACCTGGTGC TCTCACCAAG GGACTGCGGG CACGCATTGA TGCCGTCGTG 420 ATGTCCGCCA CCCTCTGCTC TGCCCTGTAC GTGGGAGATG TGTGCGGCGC AGTGATGATA 480 GCTGCACAGG CTTTCATCGT GGCACCGAAG CGCCATTACT TCGTCCAGGA ATGCAATTGC 540 TCCATATACC CAGGCCACAT TACAGGTCAT CGCATGGCG 579 193 amino acids amino acid linear peptide 16 Thr Cys Xaa Xaa Ala Asp Leu Met Gly Tyr Xaa Pro Val Val Gly Al 1 5 10 15 Pro Val Gly Gly Xaa Ala Arg Ala Leu Ala Xaa Gly Val Arg Val Le 20 25 30 Glu Asp Gly Ile Asn Tyr Xaa Thr Gly Asn Leu Pro Gly Cys Ser Ph 35 40 45 Ser Ile Phe Xaa Leu Ala Leu Leu Ser Cys Val Thr Val Pro 
Val Se 50 55 60 Xaa Val Glu Val Lys Asn Thr Ser Gln Ala Tyr Met Ala Thr Asn As 65 70 75 80 Cys Ser Asn Asn Ser Ile Val Trp Gln Leu Xaa Asp Ala Val Leu Hi 85 90 95 Val Pro Gly Cys Val Pro Cys Glu Asn Ser Ser Gly Arg Phe His Cy 100 105 110 Trp Ile Pro Ile Ser Pro Asn Ile Ala Val Ser Lys Pro Gly Ala Le 115 120 125 Thr Lys Gly Leu Arg Ala Arg Ile Asp Ala Val Val Met Ser Ala Th 130 135 140 Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys Gly Ala Val Met Il 145 150 155 160 Ala Ala Gln Ala Phe Ile Val Ala Pro Lys Arg His Tyr Phe Val Gl 165 170 175 Glu Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Me 180 185 190 Ala 957 base pairs nucleic acid single linear cDNA NO NO 17 ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACTAAAAGAA ACACTAACCG TCGCCCACAG 60 GACGTTAAGT TCCCGGGCGG CGGCCAGATC GTTGGCGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCCAGGT TGGGTGTGCG CGCGCCAAGG AAGACTTCTG AACGGTCCCA GCCACGTGGA 180 AGGCGCCAGC CCATCCCAAA AGATCGGCGC GCCACTGGCA AGTCCTGGGG ACGTCCAGGA 240 TACCCTTGGC CCCTGTACGG GAACGAGGGC CTCGGCTGGG CAGGGTGGCT CCTGTCCCCC 300 CGGGGCTCTC GCCCCTCGTG GGGCCCAAAC GACCCCCGGC ACAGGTCACG CAACTTGGGT 360 AAGGTCATCG ATACCCTCAC GTGTGGCTTT GSCGACCTCA TGGGGTACAT ACCTGTCGTC 420 GGCGCCCCTG TGGGCGGCGT TGCCAGAGCC CTCGCGCATG GCGTGCGGGT CCTGGAGGAC 480 GGGATAAATT ATGCAACAGG GAACTTGCCC GGTTGCTCCT TTTCTATCTT CTTGCTGGCT 540 CTCTTGTCTT GTATCACCGT GCCCGTGTCT GCCATACAGG TTAAGAACAA CAGCCACTTC 600 TACATGGCGA CTAATGACTG TGCCAATGAC AGCATCGTCT GGCAGCTCAG GGACGCGGTG 660 CTCCATGTTC CTGGATGTGT CCCCTGTGAG AGGTCAGGTA ATAGGACCTT CTGTTGGACA 720 GCGGTCTCGC CCAACGTGGC TGTGAGCCGA CCTGGTGCTC TCACTAGAGG TCTGCGGGCT 780 CACATTGATA CCATCGTGAT GTCCGCCACC CTCTGCTCTG CCCTATACAT AGGGGACCTA 840 TGCGGCGCTG TGATGATAGC AGCGCAAGTT GCCGTCGTCT CACCGCAATA CCATACTTTT 900 GTCCAGGAAT GCAACTGCTC CATATACCCA GGCCATATCA CAGGACATCG AATGGNN 957 319 amino acids amino acid linear peptide 18 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu 
Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Pro Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Asp Arg Arg Ala Thr Gly Lys Ser Trp Gly Arg Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pr 100 105 110 Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Gly Phe Xaa Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Va 130 135 140 Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu As 145 150 155 160 Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Il 165 170 175 Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Ala Il 180 185 190 Gln Val Lys Asn Asn Ser His Phe Tyr Met Ala Thr Asn Asp Cys Al 195 200 205 Asn Asp Ser Ile Val Trp Gln Leu Arg Asp Ala Val Leu His Val Pr 210 215 220 Gly Cys Val Pro Cys Glu Arg Ser Gly Asn Arg Thr Phe Cys Trp Th 225 230 235 240 Ala Val Ser Pro Asn Val Ala Val Ser Arg Pro Gly Ala Leu Thr Ar 245 250 255 Gly Leu Arg Ala His Ile Asp Thr Ile Val Met Ser Ala Thr Leu Cy 260 265 270 Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Ala Val Met Ile Ala Al 275 280 285 Gln Val Ala Val Val Ser Pro Gln Tyr His Thr Phe Val Gln Glu Cy 290 295 300 Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Xaa 305 310 315 447 base pairs nucleic acid single linear cDNA NO NO 19 GACGGGGTAA ATTATGCAAC AGGGAATCTG CCTGGTTGCT CTTTCTCTAT CTTCTTGTTG 60 GCTCTTCTGT CTTGTGTCAC CGTGCCTGTC TCTGCCGTGC AGGTTAAGAA CACCAGTACC 120 ATGTACATGG CAACCAATGA CTGTTCCAAC AACAGCATCA TCTGGCAAAT GCAGGGCGCG 180 GTGCTTCATG TTCCTGGATG TGTCCCGTGT GAGTTGCAGG GCAATAAGTC CCGGTGCTGG 240 ATACCGGTCA CTCCCAACGT GGCTGTGAAC CAGCCCGGCG CCCTCACTAG GGGCTTGCGG 300 ACGCACATTG ACACCATCGT GATGGTCGCT ACGCTCTGTT CTGCACTCTA CATCGGGGAC 360 GTGTGTGGCG CGGTGATGAT AGCTGCTCAG GTTGTCATTG TCTCGCCGCA ACATCACAAC 420 TTTTCCCAGG ATTGCAATTG TTCCATC 447 149 amino acids amino acid linear peptide 20 Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro 
Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Val Thr Val Pro Val Ser Al 20 25 30 Val Gln Val Lys Asn Thr Ser Thr Met Tyr Met Ala Thr Asn Asp Cy 35 40 45 Ser Asn Asn Ser Ile Ile Trp Gln Met Gln Gly Ala Val Leu His Va 50 55 60 Pro Gly Cys Val Pro Cys Glu Leu Gln Gly Asn Lys Ser Arg Cys Tr 65 70 75 80 Ile Pro Val Thr Pro Asn Val Ala Val Asn Gln Pro Gly Ala Leu Th 85 90 95 Arg Gly Leu Arg Thr His Ile Asp Thr Ile Val Met Val Ala Thr Le 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Asp Val Cys Gly Ala Val Met Ile Al 115 120 125 Ala Gln Val Val Ile Val Ser Pro Gln His His Asn Phe Ser Gln As 130 135 140 Cys Asn Cys Ser Ile 145 310 base pairs nucleic acid single linear cDNA NO NO 21 ATGAGCACAA ATCCTAAACC TCAAAGAAAA ACCAAAAGAA ACACTAACCG CCGCCCACAG 60 GACGTTAAGT TCCCGGGCGG TGGCCAGATC GTTGGCGGAG TATACTTGTT GCCGCGCAGG 120 GGCCCCCGGT TGGGTGTGCG CGCGACGAGG AAAACTTCCG AACGGTCCCA GCCACGTGGG 180 AGGCGCCAGC CCATCCCTAA AGATCGGCGC TCCACTGGCA AATCCTGGGG ACGTCCAGGA 240 TACCCTTGGC CCCTGTATGG GAACGAGGGC CTTGGTTGGG CAGGATGGCT CTTGTCCCCT 300 CGAGGCTCTC 310 48 amino acids amino acid linear peptide 22 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Arg Ser Leu Al 20 25 30 Glu Tyr Thr Cys Ala Arg Arg Gly Lys Leu Arg Arg Ser Ser Met Gl 35 40 45 447 base pairs nucleic acid single linear cDNA NO NO 23 GACGGGATAA ACTACGCAAC AGGGAATCTG CCCGGTTGCT CCTTTTCTAT CTTCTTGCTG 60 GCCTTGCTAT CCTGTCTCAC TGTGCCGGCG TCCGCTGTGC AGGTCAAGAA CACCAGCCAC 120 TCTTATATGG TGACCAATGA TTGCTCAAAC AGCAGCATTG TCTGGCAGCT TAAGGATGCT 180 GTGCTTCACG TCCCTGGATG TGTTCCATGT GAGAGGCACC AAAATCAGTC TCGCTGCTGG 240 ATACCTGTGA CACCCAATGT GGCCGTGAGC CAACCTGGCG CGCTCACCAG GGGTTTGCGG 300 ACGCACATTG ACACCATCGT TGCGTCTGCT ACCGTCTGCT CAGCTTTGTA TGTGGGCGAC 360 TTCTGCGGCG CAGTGATGTT GGTCTCTCAA TTTTTCATGA TCTCCCCTCA GCACCACATC 420 TTCGTCCAGG ATTGCAACTG CTCGATA 447 149 amino acids amino acid linear peptide 24 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys 
Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Val Gln Val Lys Asn Thr Ser His Ser Tyr Met Val Thr Asn Asp Cy 35 40 45 Ser Asn Ser Ser Ile Val Trp Gln Leu Lys Asp Ala Val Leu His Va 50 55 60 Pro Gly Cys Val Pro Cys Glu Arg His Gln Asn Gln Ser Arg Cys Tr 65 70 75 80 Ile Pro Val Thr Pro Asn Val Ala Val Ser Gln Pro Gly Ala Leu Th 85 90 95 Arg Gly Leu Arg Thr His Ile Asp Thr Ile Val Ala Ser Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys Gly Ala Val Met Leu Va 115 120 125 Ser Gln Phe Phe Met Ile Ser Pro Gln His His Ile Phe Val Gln As 130 135 140 Cys Asn Cys Ser Ile 145 356 base pairs nucleic acid single linear cDNA NO NO 25 GACGGGATAA ACTATGCAAC AGGGAACCTG CCTGGTTGCT CCTTTTCTAT CTTCTTACTG 60 GCCCTGCTTT CTTGCATCAC CGTGCCGGTC TCTGCCGTGC AAGTTGCGAA CCGCAGTGGT 120 TCTTACATGG TGACCAATGA TTGCTCGAAC AGCAGCATCG TTTGGCAGCT CGAGGAGGCC 180 GTCCTTCACG TCCCTGGATG TGTTCCCTGT GAGTGGAAGG ACAACACCTC CCGCTGCTGG 240 ATACCGGTCA CCCCTAACAT CGCTGTGAGC CAACCTGGCG CGCTTACCAA GGGCCTGCGG 300 ACACATATTG ACATCATTGT CGCGTCCGCC ACGTTCTGCT CTGCCTTGTA TGTGGG 356 118 amino acids amino acid linear peptide 26 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Val Pro Val Ser Al 20 25 30 Val Gln Val Ala Asn Arg Ser Gly Ser Tyr Met Val Thr Asn Asp Cy 35 40 45 Ser Asn Ser Ser Ile Val Trp Gln Leu Glu Glu Ala Val Leu His Va 50 55 60 Pro Gly Cys Val Pro Cys Glu Trp Lys Asp Asn Thr Ser Arg Cys Tr 65 70 75 80 Ile Pro Val Thr Pro Asn Ile Ala Val Ser Gln Pro Gly Ala Xaa Th 85 90 95 Lys Gly Leu Arg Thr His Ile Asp Ile Ile Val Ala Ser Ala Thr Ph 100 105 110 Cys Ser Ala Leu Tyr Val 115 310 base pairs nucleic acid single linear cDNA NO NO 27 ATGAGCACGA ATCCTAAACC TCAAAGAAAA ACCAAACGTA ACACCAACCG CCGCCCCATG 60 GACGTTAAGT TCCCGGGTGG TGGCCAGATC GTTGGCGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCCAGGT TGGGTGTGCG CGCGACTCGG AAGACTTCGG AGCGGTCGCA ACCTCGTGGG 180 AGACGCCAAC CTATCCCCAA GGCGCGTCGA TCCGAGGGAA 
GGTCCTGGGC ACAGCCAGGA 240 TATCCATGGC CTCTTTACGG TAATGAGGGT TGCGGGTGGG CANNATGGCT CTTGTCCCCC 300 CGCGGTTCTC 310 117 amino acids amino acid linear peptide 28 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Xaa Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pr 100 105 110 Arg Arg Arg Ser Arg 115 447 base pairs nucleic acid single linear cDNA NO NO 29 GACGGGATCA ATTTTGCAAC AGGGAACCTC CCCGGTTGCT CCTTTTCTAT CTTCCTCTTG 60 GCACTCCTCT CGTGCCTGAC TGTCCCCGCT TCGGCCATCA ACTATCGCAA TGTCTCGGGC 120 ATTTACTATG TCACCAATGA TTGCCCGAAT TCAAGCATAG TGTATGAGGC CGACCATCAC 180 ATCTTGCACC TCCCAGGTTG CGTGCCCTGC GTGAGAGAGG GGAATCAGTC ACGTTGCTGG 240 GTAGCCCTTA CCCCTACCGT CGCAGCGCCA TACATCGGCG CGCCACTTGA GTCTCTACGG 300 AGTCATGTGG ACTTGATGGT GGGGGCCGCC ACTGTTTGTT CAGCCCTTTA CATCGGGGAT 360 TTRTGTGGYG GCTTGTTCCT AGTCGGTCAG ATGTTCTCTT TCCGACCAAG GCGCCACTGG 420 ACTACTCAAG ATTGCAATTG TTCCATC 447 149 amino acids amino acid linear peptide 30 Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Ile Asn Tyr Arg Asn Val Ser Gly Ile Tyr Tyr Val Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser Arg Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro Le 85 90 95 Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Asp Xaa Cys Xaa Gly Leu Phe Leu Va 115 120 125 Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln As 130 135 140 Cys Asn Cys Ser Ile 145 
447 base pairs nucleic acid single linear cDNA NO NO 31 GACGGGATCA ATTATGCAAC AGGGAACCTT CCCGGTTGCT CTTTTTCTAT CTTCCTCTTG 60 GCACTCCTCT CGTGCCTGAC TGTTCCCGCT TCGGCCATTA ACTACCGCAA CACCTCGGGC 120 ATCTACCACG TCACCAATGA CTGCCCGAAC TCGAGCATAG TTTATGAGGC CGACCACCAC 180 ATCTTGCACC TTCCAGGTTG CGTGCCCTGC GTGAGAACTG GGAATCAGTC ACGTTGCTGG 240 GTGGCCCTTA CTCCTACCGT CGCAGCGCCA TACATCGGCG CACCGCTTGA GTCTCTGCGG 300 AGTCATGTGG ATCTGATGGT GGGGGCTGCC ACTGTTTGCT CAGCCCTTTA CATCGGGGAT 360 TTGTGTGGCG GCTTGTTCTT GGTTGGTCAG ATGTTTTCTT TCCGACCACG ACGCCACTGG 420 ACTGCCCAGG ATTGCAATTG TTCTATC 447 149 amino acids amino acid linear peptide 32 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Ile Asn Tyr Arg Asn Thr Ser Gly Ile Tyr His Val Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Thr Gly Asn Gln Ser Arg Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro Le 85 90 95 Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu Va 115 120 125 Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Ala Gln As 130 135 140 Cys Asn Cys Ser Ile 145 447 base pairs nucleic acid single linear cDNA NO NO 33 GACGGGATTA ATTATGCAAC AGGGAATCTT CCCGGTTGCT CCTTTTCTAT CTTCCTCTTG 60 GCACTTCTCT CGTGCCTGAC TGTCCCCGCT TCGGCCATTA ACTACCACAA CACCTCGGGC 120 ATCTATCATA TCACCAACGA CTGCCCGAAT TCAAGCATAG TGTATGAGGC CGACCATCAC 180 ATCTTGCATC TCCCAGGTTG CGTGCCCTGC GTGAGAGTGG GGAATCAGTC GAGTTGCTGG 240 GTGGCCCTTA CCCCTACCAT CGCAGCGCCA TACATCGGCG CACCGCTTGA GTCCTTGCGG 300 AGTCATGTGG ATCTGATGGT GGGGGCGGCC ACTGTCTGTT CAGCCCTTTA CATCGGGGAT 360 TTGTGTGGCG GTGCGTTCTT GGTTGGTCAG ATGTTCTCTT TCCGACCACG GCGCCACTGG 420 ACCACCCAAG ATTGCAACTG CTCCATC 447 149 amino acids amino acid linear peptide 34 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 
10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Ile Asn Tyr His Asn Thr Ser Gly Ile Tyr His Ile Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Val Gly Asn Gln Ser Ser Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Ile Ala Ala Pro Tyr Ile Gly Ala Pro Le 85 90 95 Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Ala Phe Leu Va 115 120 125 Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln As 130 135 140 Cys Asn Cys Ser Ile 145 447 base pairs nucleic acid single linear cDNA NO NO 35 GACGGGATCA ATTATGCAAC AGGGAATATT CCCGGTTGCT CYTTTTCTAT CTTCCTTYTG 60 GCACTTCTCT CGTGTCTGAC TGTCCCCGCT TCGGCCACTA ACTATCGCAA CGTCTCGGGC 120 ATCTACCATG TCACCAATGA CTGCCCGAAT TCAAGCATAG TGTATGAGGC CGACCATCAC 180 ATCTTAGCAC TTCCAGGTTG CGTGCCCTGC GTGAGAGTGG GGAACCAGTC ACGCTGCTGG 240 GTGGCCCTTA CCCCTACCGT CGCAGCGCCA TACACCGCGG CGCCGCTTGA GTCCCTGCGG 300 AGTCATGTGG ATCTGATGGT GGGAGCTGCC ACTGTTTGTT CAGCCCTTTA CATCGGGGAY 360 TTGTGTGGCG GCTTGTTCTT GGTTGGTCAG ATGTTCTCTT TYCAGCCTCG GCGCCACTGG 420 ACTACCCAGG ATTGCAATTG TTCCATC 447 149 amino acids amino acid linear peptide 36 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Ile Pro Gly Cys Xaa Phe Se 1 5 10 15 Ile Phe Leu Xaa Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Thr Asn Tyr Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu Ala Le 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Val Gly Asn Gln Ser Arg Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Thr Ala Ala Pro Le 85 90 95 Glu Ser Leu Arg Ser His Val Asp Leu Met Val Gly Ala Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Xaa Leu Cys Gly Gly Leu Phe Leu Va 115 120 125 Gly Gln Met Phe Ser Xaa Gln Pro Arg Arg His Trp Thr Thr Gln As 130 135 140 Cys Asn Cys Ser Ile 145 447 base pairs nucleic acid single linear cDNA NO NO 37 GACGGGATTA 
ATTATGCAAC AGGGAAYCTC CCCGGTTGCT CTTTTTCTAT CTTCCTCTTG 60 GCACTTCTCT CGTGCCTGAC TGTCCCCGCT TCGGCCACCA ACTACCGCAA TGTCTCGGGC 120 ATTTACCATG TCACCAATGA CTGCCCGAAT TCAAGCATAG TGTTTGAGGC CGACCATCAC 180 ATCTTGCACC TTCCAGGATG CGTGCCCTGC GTGAAAGAGG GAAATCATTC ACGCTGCTGG 240 GTGGCCCTTA CCCCTACCGT CGCAGCGCCA TACATCGGCG CGCCACTTGA GTCTCTACGG 300 AGTCATGTGG ATGTGATGGT GGGGGCTGCC ACTGTTTGTT CAGCCCTTTA CATCGGGGAT 360 CTGTGCGGTG GCTTGTTCCT GGTTGGTCAG ATGTTCTCTT TCCGACCACG GCGCCACTGG 420 ACTACCCAGG AATGCAATTG TTCCATC 447 149 amino acids amino acid linear peptide 38 Asp Gly Ile Asn Tyr Ala Thr Gly Xaa Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Al 20 25 30 Thr Asn Tyr Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Phe Glu Ala Asp His His Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn His Ser Arg Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile Gly Ala Pro Le 85 90 95 Glu Ser Leu Arg Ser His Val Asp Val Met Val Gly Ala Ala Thr Va 100 105 110 Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys Gly Gly Leu Phe Leu Va 115 120 125 Gly Gln Met Phe Ser Phe Arg Pro Arg Arg His Trp Thr Thr Gln Gl 130 135 140 Cys Asn Cys Ser Ile 145 447 base pairs nucleic acid single linear cDNA NO NO 39 GACGGGATCA ATTATGCAAC AGGGAACCTC CCCGGTTGCT CTTTCTCTAT CTTCATCCTG 60 GCACTTCTCT CGTGCCTGAC TGTCCCGGCC TCGGCTCAGC ATTATCGGAA TGTCTCGGGC 120 ATTTACCACG TCACCAACGA CTGCCCGAAC TCCAGCATAG TGTATGAGTC CGACCATCAC 180 ATCTTACACC TACCAGGGTG TGTACCCTGT GTGAAGACTG GGAACACTTC GCGCTGCTGC 240 GTGGCCTTAA CACCTACCGT GGCCGCGCCC ATACTTTCGG CTCCACTTAT GTCCGTACGG 300 CGGCATGTGG ATCTGATGGT GGGTGCAGCT ACCCTATCGT CTGCCCTCTA CGTTGGAGAG 360 CTCTGCGGGG GTGCCTTCCT AGTGGGGCAG ATGTTCACCT TCCAGCCGCG TCGCCACTGG 420 ACTGTCCAAG ACTGCAACTG TTCCATC 447 149 amino acids amino acid linear peptide 40 Asp Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Ile Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 
Al 20 25 30 Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cy 35 40 45 Pro Asn Ser Ser Ile Val Tyr Glu Ser Asp His His Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Lys Thr Gly Asn Thr Ser Arg Cys Tr 65 70 75 80 Val Ala Leu Thr Pro Thr Val Ala Ala Pro Ile Leu Ser Ala Pro Le 85 90 95 Met Ser Val Arg Arg His Val Asp Leu Met Val Gly Ala Ala Thr Le 100 105 110 Ser Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Ala Phe Leu Va 115 120 125 Gly Gln Met Phe Thr Phe Gln Pro Arg Arg His Trp Thr Val Gln As 130 135 140 Cys Asn Cys Ser Ile 145 957 base pairs nucleic acid single linear cDNA NO NO 41 ATGAGCACAC TTCCAAAACC CCAAAGAAAA ACCAAAAGAA ATACTAACCG TCGCCCTATG 60 GACGTCAAGT TCCCGGGCGG CGGCCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCTCGTT TGGGTGTGCG CGCGACGAGA AAGACCTCCG AACGGTCCCA GCCTAGAGGC 180 AGGCGCCAGC CCATACCAAA GGTACGCCAG CCGACAGGCC GTAGCTGGGG TCAACCCGGC 240 TACCCTTGGC CCCTTTATGG CAACGAGGGC TGCGGATGGG CGGGATGGCT CCTGTCCCCC 300 CGCGGGTCTC GTCCTAATTG GGGCCCCAAC GACCCCCGGC GAAGGTCCCG CAACTTGGGT 360 AAGGTCATCG ATACCCTTAC ATNCGGNCTA GCCGACCTCA TGGGGTACAT CCCTGTCCTA 420 GGAGGGCCGC TTGGCGGCGT TGCGGCTGCC CTGGCGCATG GCGTTAGGGC AATCGAGGAC 480 GGGGTCAATT ACGCAACAGG GAATCTTCCT GGTTGCTCCT TTTCTATCTT CCTCTTAGCA 540 CTGTTATCGT GCCTCACTAC ACCAGCCTCA GCAATTCAAG TCAAGAACGC CTCTGGGATC 600 TACCATCTTA CCAATGACTG CTCGAACAAC AGCATCGTTT TTGAGGCGGA GACCATGATA 660 CTGCATCTTC CAGGTTGTGT CCCATGTATC AAGGCGGGGA ATGAGTCACG ATGTTGGCTC 720 CCTGTCTCCC CCACCTTAGC CGTCCCCAAC TCATCAGTGC CAATCCACGG GTTTCGCCGA 780 CACGTAGACC TCCTCGTTGG GGCAGCGGCA TTTTGTTCGG CCATGTACAT CGGAGACCTC 840 TGTGGTAGCA TAATCTTGGT AGGGCAGCTT TTTACTTTCA GGCCTAAGTA CCATCAGGTT 900 ACCCAGGATT GTAACTGCTC TATNAACNCT GGCCACGTCA CGGGACACAG GATGGCA 957 319 amino acids amino acid linear peptide 42 Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys 
Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Val Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pr 100 105 110 Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Xa 115 120 125 Xaa Leu Ala Asp Leu Met Gly Tyr Ile Pro Val Leu Gly Gly Pro Le 130 135 140 Gly Gly Val Ala Ala Ala Leu Ala His Gly Val Arg Ala Ile Glu As 145 150 155 160 Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Il 165 170 175 Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Thr Pro Ala Ser Ala Il 180 185 190 Gln Val Lys Asn Ala Ser Gly Ile Tyr His Leu Thr Asn Asp Cys Se 195 200 205 Asn Asn Ser Ile Val Phe Glu Ala Glu Thr Met Ile Leu His Leu Pr 210 215 220 Gly Cys Val Pro Cys Ile Lys Ala Gly Asn Glu Ser Arg Cys Trp Le 225 230 235 240 Pro Val Ser Pro Thr Leu Ala Val Pro Asn Ser Ser Val Pro Ile Hi 245 250 255 Gly Phe Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cy 260 265 270 Ser Ala Met Tyr Ile Gly Asp Leu Cys Gly Ser Ile Ile Leu Val Gl 275 280 285 Gln Leu Phe Thr Phe Arg Pro Lys Tyr His Gln Val Thr Gln Asp Cy 290 295 300 Asn Cys Ser Xaa Asn Xaa Gly His Val Thr Gly His Arg Met Ala 305 310 315 957 base pairs nucleic acid single linear cDNA NO NO 43 ATGAGCACAC TTCCAAAACC CCAAAGAAAA ACCAAAAGAA ACACCATCCG CCGCCCACAG 60 GACGTCAAGT TCCCGGGTGG CGGCCAGATC GTTGGTGGAG TCTACTTGCT GCCGCGCAGG 120 GGCCCGCGCT TGGGTGTGCG CGCGACGAGA AAGACTTCTG AACGGTCCCA GCCCAGAGGT 180 AGGCGCCAAC CAATACCCAA AGTGCGCCAC CAAACGGGCC GTACCTGGGC CCAGCCCGGG 240 TACCCCTGGC CTCTTTATGG AAATGAGGGC TGTGGTTGGG CAGGCTGGCT CCTGTCCCCC 300 CGCGGCTCTC GCCCAAATTG GGGCCCAAAC GACCCCCGGC GGAGGTCCCG CAACTTGGGT 360 AAAGTCATCG ACACCCTTAC TTGCGGCTTC GCCGACCTCA TGGGGTATAT CCCTGTCGTA 420 GGCGCTCCGW TGGGAGGCGT CGCGGNGGCC TTGGCGCATG GGGTCANGGN CATCGAGGAC 480 GGNGTAAATT ACGCAACAGN GAATCTTCCC GGNNGCTCTN TCTCTATCTT NCTCTTGGCA 540 CTTCTCTCGT GCCTTACAAC ACCAGCCTCC GCGGCGCATT 
ATACCAACAA GTCTGGCCTG 600 TACCATCTCA CCAACGACTG CCCCAACAGC AGCATCGTTT ATGAGGCGGA GACACTGATT 660 TTGCACTTGC CTGGGTGTGT ACCTTGTGTG AAGRTGRACA ATCAATCCCG GTGCTGGGTG 720 CAGGCCTCCC CGACCCTGGC AGTGCCGAAC GCGTCTACGC CAGTCACCGG GTTCCGCAAA 780 CATGTGGACA TCATGGTGGG CGCTGCCGCG TTCTGTTCAG CTATGTATGT GGGGGACCTG 840 TGCGGGGGCC TTTTCCTCGT TGGACAGCTC TTCACGCTCA GGCCTCGGAT GCATCAGGTT 900 GTCCAGGAGT GTAACTGTTC CATCTACACA GGGCATATCA CTGGACACCG AATGGCA 957 319 amino acids amino acid linear peptide 44 Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Il 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Val Arg His Gln Thr Gly Arg Thr Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pr 100 105 110 Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Xa 130 135 140 Gly Gly Val Ala Xaa Ala Leu Ala His Gly Val Xaa Xaa Ile Glu As 145 150 155 160 Xaa Val Asn Tyr Ala Thr Xaa Asn Leu Pro Xaa Xaa Ser Xaa Ser Il 165 170 175 Xaa Leu Leu Ala Leu Leu Ser Cys Leu Thr Thr Pro Ala Ser Ala Al 180 185 190 His Tyr Thr Asn Lys Ser Gly Leu Tyr His Leu Thr Asn Asp Cys Pr 195 200 205 Asn Ser Ser Ile Val Tyr Glu Ala Glu Thr Leu Ile Leu His Leu Pr 210 215 220 Gly Cys Val Pro Cys Val Lys Xaa Xaa Asn Gln Ser Arg Cys Trp Va 225 230 235 240 Gln Ala Ser Pro Thr Leu Ala Val Pro Asn Ala Ser Thr Pro Val Th 245 250 255 Gly Phe Arg Lys His Val Asp Ile Met Val Gly Ala Ala Ala Phe Cy 260 265 270 Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu Val Gl 275 280 285 Gln Leu Phe Thr Leu Arg Pro Arg Met His Gln Val Val Gln Glu Cy 290 295 300 Asn Cys Ser Ile Tyr Thr Gly His Ile Thr Gly His Arg Met Ala 305 310 315 413 base pairs 
nucleic acid single linear cDNA NO NO 45 ATGAGCACAC TTCCTAAACC TCAAAGAAAA ACCAAACGAA ACACCAACCG TCGCCCACAG 60 GACGTCAAGT TCCCGGGTGG CGGTCAGATC GTTGGTGGAG TTTACTTGTT GCCGCGCAGG 120 GGCCCTCGTT TGGGTGTGCG CGCGACGAGG AAAACTTCTG AACGGTCCCA GCCCAGGGGT 180 AGACGCCAAC CTATACCGAA GGTGCGTCAC CAAACGGGCC GTACCTGGGC TCAACCCGGG 240 TACCCCTGGC CTCTTTATGG GAATGAGGGT TGTGGCTGGG CAGGGTGGCT CCTGTCCCCC 300 CNCGGCTCTC GCCCTAATTG GGGCCCTAAT GACCCCCGGN GGAGGTCCCG CAACCTGGGT 360 AAGGTCATCG ATACCCTTAC TTGNGGSTTC GCCGACCTCA TAGAGTACAT TCC 413 137 amino acids amino acid linear peptide 46 Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Val Arg His Gln Thr Gly Arg Thr Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Xaa Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pr 100 105 110 Arg Xaa Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Xa 115 120 125 Xaa Phe Ala Asp Leu Ile Glu Tyr Ile 130 135 957 base pairs nucleic acid single linear cDNA NO NO 47 ATGAGCACAC TTCCAAAACC CCAAAGAAAA ACCAAAAGAA ACACAAACCG TCGCCCAATG 60 GATGTCAAGT TCCCGGGCGG CGGTCAGATC GTTGGTGGAG TCTACTTGTT ACCGCGCAGG 120 GGCCCACGTT TGGGTGTGCG CGCGACGAGG AAGACTTCGG AACGGTCCCA GGCCAGAGGT 180 AGGCGCCAAC CAATACCCAA GGTGCGCCAG AACCAAGGCC GAACCTGGGC TCAGCCTGGG 240 TACCCCTGGC CCCTTTATGG GAACGAGGGC TGCGGCTGGG CGGGGTGGCT CTTGTCCCCC 300 CGTGGCTCTC GCCCGGACTG GGGNCCCAAT GACCCCCGGN GGAGGTCCCG CAACCTGGGT 360 AAGGTCATCG ACACCCTCAC TTGCGGCTTC GCCGACCTCA TGGAGTACAT CCCTGTCGTT 420 GGCGCCCCCC TTGGAGGCGT TGCGGCGGAA CTGGNACATG GTGTCAGGGC CATCGAGGAC 480 GGGATAAACT ATGCAACAGG GAATCTTCCT GGTTGCTCTT TCTCTATCTT CCWCTTGGCA 540 CTTCTCTCGT GCCTCACCAC GCCTGCCTCC GCACTAAACT ATGCTAACAA GTCTGGGCTG 600 TATCATCTAA CCAATGACTG CCCCAATAGC AGCATTGTGT 
ATGAGGCGAA TGGCATGATC 660 CTGCATCTCC CGGGTTGCGT CCCCTGCGTG AAGACCGGCA ACCTGACCAA GTGTTGGCTG 720 TCGGCCTCCC CGACATTGGC GGTGCAGAAT GCGTCGGTGT CCATCAGGGG TGTCCGCGAG 780 CACGTGGACC TCTTGGTGGG TGCTGCTGCG TTCTGCTCTG CCATGTACGT GGGCGACTTA 840 TGCGGTGGGC TCTTTCTCGT TGGGCAGTTG TTCACGTTCA GACCCAGGAT GTATGAGATC 900 GCCCAGGACT GCAACTGTTC CATCTATGCA GGCCACATCA CTGGGCACCG GATGGCG 957 319 amino acids amino acid linear peptide 48 Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Ala Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Val Arg Gln Asn Gln Gly Arg Thr Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Asp Trp Xaa Pro Asn Asp Pr 100 105 110 Arg Xaa Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Gly Phe Ala Asp Leu Met Glu Tyr Ile Pro Val Val Gly Ala Pro Le 130 135 140 Gly Gly Val Ala Ala Glu Leu Xaa His Gly Val Arg Ala Ile Glu As 145 150 155 160 Gly Ile Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Il 165 170 175 Phe Xaa Leu Ala Leu Leu Ser Cys Leu Thr Thr Pro Ala Ser Ala Le 180 185 190 Asn Tyr Ala Asn Lys Ser Gly Leu Tyr His Leu Thr Asn Asp Cys Pr 195 200 205 Asn Ser Ser Ile Val Tyr Glu Ala Asn Gly Met Ile Leu His Leu Pr 210 215 220 Gly Cys Val Pro Cys Val Lys Thr Gly Asn Leu Thr Lys Cys Trp Le 225 230 235 240 Ser Ala Ser Pro Thr Leu Ala Val Gln Asn Ala Ser Val Ser Ile Ar 245 250 255 Gly Val Arg Glu His Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cy 260 265 270 Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Gly Leu Phe Leu Val Gl 275 280 285 Gln Leu Phe Thr Phe Arg Pro Arg Met Tyr Glu Ile Ala Gln Asp Cy 290 295 300 Asn Cys Ser Ile Tyr Ala Gly His Ile Thr Gly His Arg Met Ala 305 310 315 309 base pairs nucleic acid single linear cDNA NO NO 49 ATGAGCACAC TTCCTAAACC ACAAAGAAAA 
ACCAAAAGAA ACACCAACCC CGGCCACAGG 60 ACGTTAAGTT CCCAGGCGGC GGTCAGATCG TTGGTGGAGT TTACGTGCTA CCACGCAGGG 120 GCCCCCAGTT GGGTGTGCGT GCAGTGCGCA AGACTTCCGA GCGGTCGCAA CCTCGCAGTA 180 GGCGCCAACC CATCCCCAGG GCGCGCCGAA CCGAGGGCAG GTCCTGGGCT CAGCCCGGGT 240 ACCCTTGGCC CCTATATGGG AATGAGGGCT GCGGGTGGGC AGGGTGGCTC CTGTCCCCGC 300 GCGGCTCTC 309 115 amino acids amino acid linear peptide 50 Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Xaa Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Val Leu Pro Arg Arg Gly Pro Gln Leu Gly Val Arg Al 35 40 45 Val Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Ser Arg Arg Gln Pr 50 55 60 Ile Pro Arg Ala Arg Arg Thr Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pr 100 105 110 Arg Arg Arg 115 447 base pairs nucleic acid single linear cDNA NO NO 51 GACGGAATTA ATTTCGCAAC AGGGAATTTA CCTGGTTGCT CTTTCTCTAT CTTCCTTCTG 60 GCTTTGTTCT CATGCTTGCT TACACCCACA GCCGGGCTGG AGTACCGTAA TGCCTCCGGA 120 CTCTACATGG TAACTAACGA CTGCAGTAAC GGTAGTATCG TGTATGAGGC CGGGGATATT 180 ATCCTCCACT TACCTGGCTG TGTCCCCTGC GTACGCTCTG GCAATACATC AAGATGCTGG 240 ATCCCTGTGA GCCCYACCGT CGCCGTGAAG TCGCCCTGCG CCGCCACCGC CTCTCTCCGC 300 ACGCACGTGG ATATGATGGT GGGRGCGGCC ACCCTATGCT CAGCTCTCTA CGTAGGAGAC 360 CTTTGTGGAG CGCTATTTCT TGTYGGGCAG GGGTTCTCAT GGAGACATCG CCAGCATTGG 420 ACTGTCCAGG ACTGCAACTG TTCCATC 447 149 amino acids amino acid linear peptide 52 Asp Gly Ile Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Se 1 5 10 15 Ile Phe Leu Leu Ala Leu Phe Ser Cys Leu Leu Thr Pro Thr Ala Gl 20 25 30 Leu Glu Tyr Arg Asn Ala Ser Gly Leu Tyr Met Val Thr Asn Asp Cy 35 40 45 Ser Asn Gly Ser Ile Val Tyr Glu Ala Gly Asp Ile Ile Leu His Le 50 55 60 Pro Gly Cys Val Pro Cys Val Arg Ser Gly Asn Thr Ser Arg Cys Tr 65 70 75 80 Ile Pro Val Ser Xaa Thr Val Ala Val Lys Ser Pro Cys Ala Ala Th 85 90 95 Ala Ser Leu Arg Thr His Val Asp Met 
Met Val Xaa Ala Ala Thr Le 100 105 110 Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ala Leu Phe Leu Xa 115 120 125 Gly Gln Gly Phe Ser Trp Arg His Arg Gln His Trp Thr Val Gln As 130 135 140 Cys Asn Cys Ser Ile 145 340 base pairs nucleic acid single linear cDNA NO NO 53 CTCGACAGTT ACTGAGAATG ACATCCGTGT CGAGGAATCA ATATACCAAT GTTGTGACTT 60 GGCCCCCGAG GCTCGCAAGG CCATAAAGTC GCTCACCGAG CGGCTGTACA TCGGGGGCCC 120 YCTAACCAAT TCAAAAGGAC AGAACTGCGG CTACCGTCGG TGCCGCGCCA GCGGCGTGCT 180 GACTACCAGC TGCGGCAACA CCCTGACATG CTACTTGAAA GCCAGAGCGG CCTGTCGAGC 240 TGCAAAGCTC CGGGACTGCA CCATGCTCGT GTGCGGGGAT GACCTTGTCG TTATCTGTGA 300 GAGTGCGGGA GTCGAGGAAG ACGCGGCGAA CCTACGAGCT 340 113 amino acids amino acid linear peptide 54 Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Ala Pro Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Ile Gly Gly Xaa Leu Thr Asn Ser Lys Gly Gln As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Arg Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ser Ala Gly Val Glu Glu Asp Ala Ala Asn Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 55 CTCGACAGTT ACTGAGAACG ACATCCGTAC CGAGGRATCA ATCTATCAAT GTTGTGACTT 60 GGCCCCYGAG GCCCGCAAGG CCATAAAGTC GCTCACCGAG CGGCTGTACG TCGGGGGCCC 120 CCTAACCAAT TCAAAGGGGC AGAACTGCGG CTATCGTCGG TGTCGCGCTA GCGGCGTGCT 180 GACCACCAGC TGCGGCAACA CCCTCACATG CTACTTGAAA GCCAGGGCGG CCTGTCGAGC 240 TGCAAAGCTC CAGGACTGCA CGATGCTCGT GTGCGGAGAC GACCTTGTCG TTATCTGTGA 300 GAGCGCGGGA GTCGAGGAGG ACGCGGCGAA CCTACGAGTC 340 113 amino acids amino acid linear peptide 56 Ser Thr Val Thr Glu Asn Asp Ile Arg Thr Glu Xaa Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Ala Xaa Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr 
Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Arg Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ser Ala Gly Val Glu Glu Asp Ala Ala Asn Leu Ar 100 105 110 Val 340 base pairs nucleic acid single linear cDNA NO NO 57 CTCGACAGTT ACTGAGAACG ACATTCGTGT CGAGGAATCA ATCTACCAGT GCTGTGACTT 60 GGCCCCCGAG GCCCGCAAGG CCATAAAGTC GCTCACCGAG CGGCTGTATA TCGGGGGTCC 120 CCTAACCAAC TCAAAAGGGC AGAACTGCGG CTACCGTCGG TGCCGCGCCA GCGGCGTGCT 180 GACTACCAGC TGCGGTAATA CCCTCACATG TTACTTGAAA GCCAGGGCGG CCTGTCGAGC 240 TGCGAAGCTC CAGGACTGCA CAATGCTCGT GTGCGGAGAC GACCTTGTCG TTATCTGTGA 300 GAGTGCRGGA GTCGAGGAGG ATGCGGCGAA CCTACGAGTC 340 113 amino acids amino acid linear peptide 58 Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Ala Pro Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Arg Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ser Xaa Gly Val Glu Glu Asp Ala Ala Asn Leu Ar 100 105 110 Val 652 base pairs nucleic acid single linear cDNA NO NO 59 CGTACAGCCT CCAGGACCCC CCCTCCCGGG AGAGCCATAG TGGTCTGCGG AACCGGTGAG 60 TACACCGGAA TTGCCAGGAC GACCGGGTCC TTTCTTGGAT CAACCCGCTC AATGCCTGGA 120 GATTTGGGCG TGCCCCCGCA AGACTGCTAG CCGAGTAGTG TTGGGTCGCG AAAGGCCTTG 180 TGGTACTGCC TGATAGGGTG CTTGCGAGTG CCCCGGGAGG TCTCGTAGAC CGTGCACCAT 240 GAGCACGAAT CCTAAACCTC AAAGAAAAAC CAAAAGAAAC ACCAACCGCC GCCCACAGGA 300 CGTCAAGTTC CCGGGCGGTG GCCAGATCGT TGGTGGAGTC TACGTGCTAC CGCGCAGGGG 360 CCCTAGATTG GGTGTGCGCG CAGCGCGGAA GACTTCGGAG CGGTCGCAAC CTCGTGGGAG 420 GCGCCAACCT ATTCCCAAGG AGCGCCGACC CGAGGGCAGG TCCTGGGCGC AGCCCGGGTA 480 CCCCTGGCCC CTCTATGGTA ACGAGGGCTG CGGGTGGGCA GGTNGGCTCC TGTCCCCTCG 540 CGGCTCCCGT CCTAGTTGGG GTCCTACTGA CCCCCGGCGT AGGTCACGCA 
ATTTGGGTAA 600 GGTCATCGAT ACCCTCACGT GTTGNTTCGC CGACCTCATG GGGTACATAC CG 652 138 amino acids amino acid linear peptide 60 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Val Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Ala Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Glu Arg Arg Pro Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Xa 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pr 100 105 110 Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Xaa Phe Ala Asp Leu Met Gly Tyr Ile Pro 130 135 340 base pairs nucleic acid single linear cDNA NO NO 61 CTCAACGGTC ACTGAAGCTG ATATCCGAAC AGAGGAGTCC ATATACCAAT GCTGTGACCT 60 GCACCCCGAA GCACGTGTAG CCATCAAGTC TTTGACTGAA AGGCTGTACG TCGGGGGGCC 120 CTTGACCAAT TCAAAAGGGG AGAACTGCGG CTATCGCAGA TGCCGTGCCA GCGGCGTCTT 180 GACAACCAGC TGCGGCAACA CCCTCACCTG CTATATCAAG GCCCTAGCAG CCTGTAGAGC 240 TGCCAAGCTC CAGGACTGCA CCATGCTCGT CTGTGGCGAC GACCTGGTCG TGATCTGCGA 300 GAGTGTAGGG ACCCAGGAGG ATGCGGCGAG CCTGCGAGCC 340 113 amino acids amino acid linear peptide 62 Ser Thr Val Thr Glu Ala Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu His Pro Glu Ala Arg Val Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Lys Gly Glu As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ser Val Gly Thr Gln Glu Asp Ala Ala Ser Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 63 NTCAACAGTC ACTGAGAGTG ATATCCGTAC AGAGGAGTCC ATCTACCAAT GCTGTGATCT 60 AGACCCCGAG GCTCGCAAGG CCATAAGGTC CCTCACAGAG AGGCTTTATA TCGGGGGTCC 120 CCTGACAAAC TCAAAAGGGC AGAACTGCGG 
CTACCGCCGA TGCCGTGCAA GCGGCGTCCT 180 GACGACTAGC TGCGGCAACA CCCTCACCTG TTACATAAAG GCCAGGGCAG CCTGTCGAGC 240 TGCGAAGCTC CAGGATTGCT CAATGCTCGT CTGTGGCGAC GACCTTGTCG TTATCTGCGA 300 GATCGAGGGG NTCCANGAGG ATCCGTCGAN NNNNNNNNNN 340 113 amino acids amino acid linear peptide 64 Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Asp Pro Glu Ala Arg Lys Ala Ile Arg Ser Leu Th 20 25 30 Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Gln Asp Cys Ser Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ile Glu Gly Xaa Xaa Glu Asp Pro Ser Xaa Xaa Xa 100 105 110 Xaa 831 base pairs nucleic acid single linear cDNA NO NO 65 CGTAGACCGT GCACCATGAG CACGAATCCT AAACCTCAAA GAAAAACCAA ACGTAACATC 60 AACCGCCGCC CACAGGACGT CAAGTTCCCG GGCGGTGGCC AGATCGTCGG TGGAGTTTAC 120 CTGTTGCCGC GCAGGGGCCC TAGATTGGGT GTGCGCGCGA CTAGGAAGAC TTCCGAGCGG 180 TCGCAACCTC GTGGGAGGCG ACAGCCTATC CCCAAGGCTC GCCGATCCGA GGGCAGGTCC 240 TGGGCTCAGC CCGGGTACCC TTGGCCCCTC TATGGCAATG AGGGCATGGG TTGGGCAGGG 300 TGGCTCCTGT CCCCCCATGG CTCCCGGCCT AGTTGGGGCC CTTCAGACCC CCGGCGTAGG 360 TCGCGTAATT TGGGTAAGGT CATCGATACC CTCACATGCG GCTTCGCCGA CCTCATGGGG 420 TACATTCCGC TCGTCGGCGC CCCCCTAGGG GGCGTTGCCA GGGCCCTGGC GCAAGGCTTC 480 CGGGATCTAC CACGTCACCA ACGATTGTTC CAATGGGAGC ATTGTGTATG AGGCGGAAGG 540 CATGATCATG CATCTCCCCG GGTGCGTGCC CTGCGTTCGG GAAGGTAATA TCTCTCGTTG 600 CTGGGTACCG TTTTCCCCCA CGCTCGCAGC CAGGAATGCT AGCGTCCCCA CTCAGGCAAT 660 TCGGCGACAC GTCGACTTGC TTGTTGGGGC GGCCACACTC TGTTCTGCTA TGTATGTGGG 720 GGACCTCTGT GGGTCCGTCT TCCTCGTCGG CCAACTGTTC ACCTTCACAW CCCGCCAGNA 780 CTACACAGTG CAAGACTGCA ATTGTTCCAT CTACCCCGGC CATATAACGG G 831 158 amino acids amino acid linear peptide 66 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Ile As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 
30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro His Gly Ser Arg Pro Ser Trp Gly Pro Ser Asp Pr 100 105 110 Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cy 115 120 125 Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Le 130 135 140 Gly Gly Val Ala Arg Ala Leu Ala Gln Gly Phe Arg Asp Leu 145 150 155 340 base pairs nucleic acid single linear cDNA NO NO 67 NNNNNNNGTC ACTGAGAGTG ATATCCGTGT CGAGGARTCA ATTTACCAAT GCTGTGACCT 60 GGCCCCCGAG GCTCGCGTAG CCATAAAGTC GCTCACTGAG CGGCTATATG TCGGGGGCCC 120 TCTCACCAAC TCAAAAGGAC AGAACTGCGG CTATCGCCGG TGCCGTGCGA GCGGTGTGCT 180 GACTACTAGC TGCGGTAACA CCCTCACATG CTACCTGAAA GCCGCCGCGG CCTGTCGAGC 240 TGCAAAGCTC CGGGAATGCA CAATGCTCGT GTGTGGCGAC GACCTCGTCG TTATCTGTGA 300 GAGTGCGGGG GTCCAGGAGG ATGCTGCAAG CCTNNNNNNN 340 113 amino acids amino acid linear peptide 68 Xaa Xaa Val Thr Glu Ser Asp Ile Arg Val Glu Xaa Ser Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Ala Pro Glu Ala Arg Val Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln As 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cy 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ala Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Arg Glu Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Xaa Xa 100 105 110 Xaa 340 base pairs nucleic acid single linear cDNA NO NO 69 CTCGACAGTC ACAGAGAGAG ATATAAGNAC TGAGGAGTCC ATATACCAGG CTTGTTCCTT 60 ACCCGAGCAG GCCAGAACTG CCATACACTC ATTGACTGAG AGACTCTACG TAGGAGGGCC 120 CATGATGAAC AGCAAAGGGC AATCCTGCGG ATACAGGCAT TGCCGCGCCA GCGGAGTGCT 180 CACCACCAGT ATGGGGAATA CCATCACGTG CTACATCAAG GCCCTAGCGG CTTGTAAAGC 240 AGCAGGAATA GTGGCCCCCA CCATGCTGGT GTGCGGCGAT GACCTAGTTG 
TCATCTCAGA 300 GAGTCAGGGA GTCGAGGAGG ACGACCGGAA CCTGANNNNN 340 113 amino acids amino acid linear peptide 70 Ser Thr Val Thr Glu Arg Asp Ile Xaa Thr Glu Glu Ser Ile Tyr Gl 1 5 10 15 Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Met Asn Ser Lys Gly Gln Se 35 40 45 Cys Gly Tyr Arg His Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Me 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Leu Ala Ala Cys Lys Al 65 70 75 80 Ala Gly Ile Val Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Gln Gly Val Glu Glu Asp Asp Arg Asn Leu Xa 100 105 110 Xaa 340 base pairs nucleic acid single linear cDNA NO NO 71 CTCAACCGTC ACAGAGAGGG ATATAAGAAC TGAGGAGTCC ATATACCTGG CCTGCTCCTT 60 ACCCGAGCAG GCCCGGACTG CCATACATTC ATTAACTGAG AGACTTTACG TGGGAGGGCC 120 CATGATGAAC AGCAAAGGGC AGTCCTGCGG ATACAGGCGT TGCCGCGCTA GCGGAGTGCT 180 CACCACCAGT ATGGGGAACA CCATCACGTG TTATGTGAAA GCCCTCGCAG CTTGTAAAGC 240 TGCGGGCATT GTTGCCCCCA CGATGCTGGT GTGCGGCGAT GACCTGGTTG TCATCTCAGA 300 GAGTCAGGGG GCTGAGGAGG ACGAGCGAAA CCTGAGAGTC 340 113 amino acids amino acid linear peptide 72 Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Le 1 5 10 15 Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Met Asn Ser Lys Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Me 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys Lys Al 65 70 75 80 Ala Gly Ile Val Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Gln Gly Ala Glu Glu Asp Glu Arg Asn Leu Ar 100 105 110 Val 340 base pairs nucleic acid single linear cDNA NO NO 73 CTCAACAGTC GCGGAGAGAG ACATCAGGAC CGAGGAGTCC ATTTACCTTG CCTGCTCCTT 60 ACCCGAGCAA GCCCGAACTG CCATACATTC ATTGACTGAG AGACTTTACG TAGGAGGGCC 120 CATGATGAAC AGCAAGGGAC AGTCCTGCGG TTACAGACGT TGCCGCGCCA GCGGAGTGCT 180 CACCACCAGC ATGGGGAATA CCATCACATG CTATGTGAAG GCATTAGCTG CCTGCAAAGC 240 TGCAGGCATC GTTGCTCCCA 
CGATGCTGGT TTGTGGCGAC GATCTGGTCA TCATCTCAGA 300 GAGTCAGGGA ACCGAGGAGG ATGAGCGGAA CCTGAGAGTC 340 113 amino acids amino acid linear peptide 74 Ser Thr Val Ala Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Le 1 5 10 15 Ala Cys Ser Leu Pro Glu Gln Ala Arg Thr Ala Ile His Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Met Asn Ser Lys Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Me 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Leu Ala Ala Cys Lys Al 65 70 75 80 Ala Gly Ile Val Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Ile Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Ar 100 105 110 Val 1195 base pairs nucleic acid single linear cDNA NO NO 75 CGNACANCCT CCAGGCCCCC CCCTCCCGGG AGAGCCATAG TGGTCTGCGG AACCGGTGAG 60 TACACCGGAA TTGCCGGGAA GACTGGGTCC TTTCTTGGAT AAACCCACTC TATGCCCGGC 120 CATTTGGGCG TGCCCCCGCA AGACTGCTAR CCGAGTAGCG TTGGGTTGCG AAAGGCCTTG 180 TGGTACTGCC TGATAGGGTG CTTGCGAGTG CCCCGGGAGG TCTCGTAGAC CGTGCATCAT 240 GAGCACAAAT CCTAAACCTC AAAGAAAAAC CAAAAGAAAC ACTAACCGCC GCCCACAGGA 300 CGTTAAGTTC CCGGGCGGTG GCCAGATCGT TGGCGGAGTA TACTTGTTGC CNTGCAGGGG 360 NCCCAGGTNG NGTNTATGCG CAACGANGAA GACTNCCGAA CAGTCCCAGC CACGTGGGAG 420 GCGCCAGCCC ATCCCGAAAG ATCGGNGCAC CACTGGCAAG TCCTGGGGAC GTCCAGGATA 480 TCCCTGGCCC CTGTATGGGA ACGAGGGCCT CGGGTGGGCA GGGTGGCTCC TGTCCCCCCG 540 GGGCTCCCGC CCGTCATGGG GCCCCACGGA CCCCCGGCAT AGGTCGCGCA ACTTGGGTAA 600 GGTCATCGAT ACCCTCACGT NCGGCTTTNC CGACCTCATG GGGTACATTC CCGTCGTTGG 660 CGCCCCAGTA GGNGGCGTCG CCAGAGCTCT CGCGCATGGC GTGAGAGTCC TGGAGGACGG 720 GATAAACTAT GAAACAGGGA ACCTCCCCGG TTGCTCTTTC TCTATCTCCC TCCTTGCTCT 780 TCTGTCCTGA ATTACCGNGC CAGTTTCTGC TGTGGAAATC AAAAACACCA GMAACACATA 840 CATGGTGACT AACGACTGTT CAAACAGYAG CATCACCTGG CAGCTTNNGN NCGCGGTGCT 900 TCACGTTCCT GGATGCGTCC CCTGTGAACG AGAGGGCAAC AGTTCCCGGT GCTGGATTCC 960 AGTCACGCCC RACGTAKNCG TGAGCCGACC TGGTGCCCTA ACCGAGGGTT TGCGATCGCA 1020 CATCGACACC ATCGTAGCGT CCGCAACATT TTGTTCTGCC CTCTACATAG GGGATGTATG 1080 TGGCGCGATA ATGATAGCTG 
CCCAAGTGGT CATCGTCTCG CCGGAGCATC ATCACTTTGT 1140 CCAGGACTGT AACTGTTCCA TCTACCCGGG CCACATAACG GGGCCTCGTA TGTNG 1195 318 amino acids amino acid linear peptide 76 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Xaa Cys Arg Xaa Pro Arg Xaa Xaa Xaa Cys Al 35 40 45 Thr Xaa Lys Thr Xaa Glu Gln Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Asp Arg Xaa Thr Thr Gly Lys Ser Trp Gly Arg Pro Gl 65 70 75 80 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Tr 85 90 95 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pr 100 105 110 Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Xa 115 120 125 Gly Phe Xaa Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Va 130 135 140 Xaa Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu As 145 150 155 160 Gly Ile Asn Tyr Glu Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Il 165 170 175 Ser Leu Leu Ala Leu Leu Ser Ile Thr Xaa Pro Val Ser Ala Val Gl 180 185 190 Ile Lys Asn Thr Xaa Asn Thr Tyr Met Val Thr Asn Asp Cys Ser As 195 200 205 Xaa Ser Ile Thr Trp Gln Leu Xaa Xaa Ala Val Leu His Val Pro Gl 210 215 220 Cys Val Pro Cys Glu Arg Glu Gly Asn Ser Ser Arg Cys Trp Ile Pr 225 230 235 240 Val Thr Pro Xaa Val Xaa Val Ser Arg Pro Gly Ala Leu Thr Glu Gl 245 250 255 Leu Arg Ser His Ile Asp Thr Ile Val Ala Ser Ala Thr Phe Cys Se 260 265 270 Ala Leu Tyr Ile Gly Asp Val Cys Gly Ala Ile Met Ile Ala Ala Gl 275 280 285 Val Val Ile Val Ser Pro Glu His His His Phe Val Gln Asp Cys As 290 295 300 Cys Ser Ile Tyr Pro Gly His Ile Thr Gly Pro Arg Met Xaa 305 310 315 340 base pairs nucleic acid single linear cDNA NO NO 77 ATCCACAGTC ACTGAAAGAG ACATCAGAGT TGAAGAGTCC GTTTATCTGT CCTGTTCACT 60 TCCCGAGGAG GCCCGAGCTG CCATACACTC ACTAACTGAG AGGCTGTACG TGGGAGGTCC 120 CATGCAGAAC AGCAAGGGGC AATCCTGCGG ATACAGGCGC TGCCGCGCCA GCGGGGTGCT 180 CACCACTAGC ATGGGGAATA CTCTCACATG CTACTTGAAG GCCCAGGCGG CCTGCAGGGC 240 CGCGGGCATT 
GTTGCACCCA CAATGCTGGT GTGTGGCGAC GACCTGGTCG TCATCTCAGA 300 GAGTCAGGGG ACTGAGAGGG ACGAGAACAA CCTGAGACCT 340 113 amino acids amino acid linear peptide 78 Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Ser Val Tyr Le 1 5 10 15 Ser Cys Ser Leu Pro Glu Glu Ala Arg Ala Ala Ile His Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Gln Asn Ser Lys Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Me 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Gln Ala Ala Cys Arg Al 65 70 75 80 Ala Gly Ile Val Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Gln Gly Thr Glu Arg Asp Glu Asn Asn Leu Ar 100 105 110 Pro 340 base pairs nucleic acid single linear cDNA NO NO 79 CTCAACAGTC ACGGAGAGGG ACATCAGGAA TGAGGAGTCC ATATTCCTGG CCTGCTCGTT 60 GCCCGAGGAG GCCCGGACTG TCATACATTC GCTCACTGAG AGACTCTACA TAGGCGGGCC 120 GATGATGAAC AGCAAAGGCC AGTCCTGTGG ATACAGGCGT TGTCGCGCCA GCGGGGTGTT 180 CACCACTAGC ATGGGCAATA CCATCACGTG CTATGTGAAA GCCATGGCAG CTTGCAGAGC 240 TGCCGGGATT GACGCCCCCA CAATGTTGGT ATGTGGCGAC GACCTGGTGG TCATCTCAGA 300 GAGTCAGGGG ACCGAGGAGG ACGAGCGAAA TCTGAGAGTC 340 113 amino acids amino acid linear peptide 80 Ser Thr Val Thr Glu Arg Asp Ile Arg Asn Glu Glu Ser Ile Phe Le 1 5 10 15 Ala Cys Ser Leu Pro Glu Glu Ala Arg Thr Val Ile His Ser Leu Th 20 25 30 Glu Arg Leu Tyr Ile Gly Gly Pro Met Met Asn Ser Lys Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Me 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Val Lys Ala Met Ala Ala Cys Arg Al 65 70 75 80 Ala Gly Ile Asp Ala Pro Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Ar 100 105 110 Val 340 base pairs nucleic acid single linear cDNA NO NO 81 CTCTTGACTC TACTGTCACT GAACAGGATA TCAGGGTAGA AGAAGAAATA TACCAATGTT 60 GTGACCTTGA GCCGGAGGCT AGACGGGCAA TCAAATCGCT CACGGAACGG CTTTACGTTG 120 GAGGTCCCAT GTTCAACAGC AAGGGGCTCA AATGCGGATA TCGCCGTTGC CGTGCTAGCG 180 GTGTATTGCC CACTAGCTAC GGTAATACAA TCACCTGCTA CATCAAGGCC 
AGAGCGGCTG 240 CTCGAGCTGC GGGCCTTCAA GACCCATCAT TCCTTGTCTG CGGAGATGAT TTGGTGGTAG 300 TGGCTGAGAG TTGCGKCGTT GATGAGGAGG ATAGGGCAGC 340 113 amino acids amino acid linear peptide 82 Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Glu Glu Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Glu Pro Glu Ala Arg Arg Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Leu Ly 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Ty 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Arg Ala Ala Ala Arg Al 65 70 75 80 Ala Gly Leu Gln Asp Pro Ser Phe Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Val Ala Glu Ser Cys Xaa Val Asp Glu Glu Asp Arg Ala Ala Le 100 105 110 Arg 340 base pairs nucleic acid single linear cDNA NO NO 83 CTCCACTGTA ACCGAAAAGG ACATCAGGCC CGAGGAAGAG GTCTATCAGT GTTGTGACCT 60 GGAGCCCGAA GCTCGCAAGG TTATTACCGC CCTCACAGAA AGACTCTACG TGGGCGGCCC 120 CATGCACAAC AGCAAGGGAG ACCTTTGTGG GTATCGGAGA TGCCGCGCAA GCGGCGTCTA 180 CACGACCAGC TTCGGAAACA CACTGACGTG CTACCTCAAA GCCTCAGCTG CTATTAGAGC 240 GGCAGGGCTG AGAGACTGCA CCATGCTGGT TTGCGGTGAC GACTTGGTCG TCATCGCTGA 300 GAGCGATGGC GTAGAGGAGG ATAACCGAGC CCTCCNAGCC 340 113 amino acids amino acid linear peptide 84 Ser Thr Val Thr Glu Lys Asp Ile Arg Pro Glu Glu Glu Val Tyr Gl 1 5 10 15 Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Val Ile Thr Ala Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met His Asn Ser Lys Gly Asp Le 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Ph 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Ile Arg Al 65 70 75 80 Ala Gly Leu Arg Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Asp Gly Val Glu Glu Asp Asn Arg Ala Leu Xa 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 85 CTCCACGGTG ACTGAAAAGG ACATCAGGGT CGAGGAAGAG ATCTATCAAT GTTGTGACCT 60 GGARCCCGAA GCCCGCAAAG CAATATCCGC CCTCACAGAG AGRCTCTACT TGGGCGGCCC 120 CATGTATAAC AGCAAAGGGG AGCTCTGCGG GTATCGGAGG TGCCGCGCGA GCGGAGTGTA 180 CACCACAAGT TTCGGGAACA 
CAGTGACCTG CTATCTTAAG GCCACCGCAG CTACCAGGGC 240 TGCAGGCCTA AAAGACTGCA CCATGCTGGT CTGCGGTGAC GACTTGGTCG TCATCGCCGA 300 GAGCGAGGGC GTAGAGGAGG ATTCCCAACC CCTCCGAGCC 340 113 amino acids amino acid linear peptide 86 Ser Thr Val Thr Glu Lys Asp Ile Arg Val Glu Glu Glu Ile Tyr Gl 1 5 10 15 Cys Cys Asp Leu Xaa Pro Glu Ala Arg Lys Ala Ile Ser Ala Leu Th 20 25 30 Glu Xaa Leu Tyr Leu Gly Gly Pro Met Tyr Asn Ser Lys Gly Glu Le 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Ph 50 55 60 Gly Asn Thr Val Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Al 65 70 75 80 Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Glu Gly Val Glu Glu Asp Ser Gln Pro Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 87 CTCCACCGTA ACCGAAAGGG ACATCAGGGT CGAGGAGGAG GTCTATCAGT GTTGTGATCT 60 GGAGCCAGAG GCCCGCAAGG CAATATCCGC CCTCACGGAG AGACTCTATG TGGGCGGTCC 120 CATGTTTAAC AGCAAGGGAG ACCTATGTGG CTACCGCAGG TGCCGCGCAA GCGGCGTCTA 180 CACCACCAGC TTCGGAAACA CACTGACCTG CTACCTCAAG GCCACGGCCG CTACCAGAGC 240 GGCCGGCCTG AAGGATTGCA CAATGCTGGT TTGCGGGGAC GACCTGGTCG TCATCGCAGA 300 GAGCGATGGC GTGGACGAGG ACCGCCGAGC CCTCCAAGCT 340 113 amino acids amino acid linear peptide 88 Ser Thr Val Thr Glu Arg Asp Ile Arg Val Glu Glu Glu Val Tyr Gl 1 5 10 15 Cys Cys Asp Leu Glu Pro Glu Ala Arg Lys Ala Ile Ser Ala Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Phe Asn Ser Lys Gly Asp Le 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Tyr Thr Thr Ser Ph 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala Ala Thr Arg Al 65 70 75 80 Ala Gly Leu Lys Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Asp Gly Val Asp Glu Asp Arg Arg Ala Leu Gl 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 89 CTCAACAGTC ACAGAGCGCG ATGTCCAGAC GGAGCATGAC ATCTACCAGT GCTGTAAGTT 60 GGAGCCCGCA GCACGGACAG CCATCACATC GCTTACTGAC CGATTGTACT NCGGTGGTCC 120 CATGTNTAAC TCTAAAGGTC AGGCATGTGG ATACCGTAGG TGCAGGGCCA GTGGCGTCTT 
180 GACCACCATC CTGGCCAATA CTCTGACTTG CTACTTGAAA GCTCAGGCGG CATGCAGAGC 240 TGCCGGGCTG AAGGACTTTG ACATGTTGGT CTGCGGAGAC GACCTTGTCG TTATTTCGGA 300 GAGTTTGGGG GTCTCGGAGG ACACTAGTGC ACTGCGAGCT 340 113 amino acids amino acid linear peptide 90 Ser Thr Val Thr Glu Arg Asp Val Gln Thr Glu His Asp Ile Tyr Gl 1 5 10 15 Cys Cys Lys Leu Glu Pro Ala Ala Arg Thr Ala Ile Thr Ser Leu Th 20 25 30 Asp Arg Leu Tyr Xaa Gly Gly Pro Met Xaa Asn Ser Lys Gly Gln Al 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ile Le 50 55 60 Ala Asn Thr Leu Thr Cys Tyr Leu Lys Ala Gln Ala Ala Cys Arg Al 65 70 75 80 Ala Gly Leu Lys Asp Phe Asp Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Leu Gly Val Ser Glu Asp Thr Ser Ala Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 91 CTCGACAGTC ACCGAGCGCG ACATCCRCAC CGAGCACGAC ATCTACCAAT GCTGCCAACT 60 TGACCCGGTG GCACGCAAGG CTATTACATC TCTGACTGAG CGGCTGTACT GCGGWGGGCC 120 CATGATGAAC TCCCGTGGTC AATCATGTGG ATACCGTAGG TGCCGAGCCA GTGGCGTGCT 180 CACCACGAGC TTGGGCAATA CCCTAACATG CTATTTGAAA GCACAAGCAG CGTGTAGGGC 240 AGCAAAGCTC AAAAACTATG ACATGTTAGT CTGCGGAGAC GATCTAGTCG TTATCGCGGA 300 GAGTGGAGGA GTCTCTGAGG ATGTTGACGC CCTGCGAGCA 340 113 amino acids amino acid linear peptide 92 Ser Thr Val Thr Glu Arg Asp Ile Xaa Thr Glu His Asp Ile Tyr Gl 1 5 10 15 Cys Cys Gln Leu Asp Pro Val Ala Arg Lys Ala Ile Thr Ser Leu Th 20 25 30 Glu Arg Leu Tyr Cys Xaa Gly Pro Met Met Asn Ser Arg Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Le 50 55 60 Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Gln Ala Ala Cys Arg Al 65 70 75 80 Ala Lys Leu Lys Asn Tyr Asp Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Gly Gly Val Ser Glu Asp Val Asp Ala Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 93 CTCCTCCGTC ACGGAGCGTG ACATCCGCAC TGAACACGAC ATCTATCAGT GCTGCCAATT 60 AGATCCGGTA GCACGGAAAG CCATTACATC TCTTACTGAG CGGCTGTACT GCGGCGGCCC 120 CATGTACAAC TCTCGAGGTC AGTCATGTGG 
GTACCGCAGG TGCCGGGCTA GTGGTGTCTT 180 CACCACAAGC TTGGGCAACA CCATGACATG CTACCTGAAG GCTCAGGCGG CTTGTAGGGC 240 AGCRAAGCTC AAAAACTTTG ACATGTTGGT CTGCGGAGAC GACCTAGTCG TTATTGCTGA 300 GAGCGGAGGA GTCCCTGAGG ATGCCGGGGC CCTGCGAGTC 340 113 amino acids amino acid linear peptide 94 Ser Ser Val Thr Glu Arg Asp Ile Arg Thr Glu His Asp Ile Tyr Gl 1 5 10 15 Cys Cys Gln Leu Asp Pro Val Ala Arg Lys Ala Ile Thr Ser Leu Th 20 25 30 Glu Arg Leu Tyr Cys Gly Gly Pro Met Tyr Asn Ser Arg Gly Gln Se 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Le 50 55 60 Gly Asn Thr Met Thr Cys Tyr Leu Lys Ala Gln Ala Ala Cys Arg Al 65 70 75 80 Xaa Lys Leu Lys Asn Phe Asp Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Gly Gly Val Pro Glu Asp Ala Gly Ala Leu Ar 100 105 110 Val 340 base pairs nucleic acid single linear cDNA NO NO 95 ATCCACAGTC ACGGGGCGCG ACATACGCAC AGAACNAGAC ATTTACCTGT CCTGCCAGCT 60 CGACCCAGAG GCCCGGAAAG CCATAAAGTC TCTCACTGAG AGGCTCTATG TCGGGGGCCC 120 TATGTACAAC TCAAAGGGCC AACTCTGTGG TCAACGCCGA TGCCGAGCAA GCGGAGTACT 180 CCCCACAAGC ATGGGTAACA CCATCACATG CTTCCTGAAG GCAACCGCCG CTTGCCGAGC 240 AGCCGGCTTT ACAGATTATG ACATGTTGGT CTGCGGAGAC GATTTGGTTG TCGTAACTGA 300 GAGTGCTGGA GTCAACGAGG ATATCGCTAA CCTGCGAGCC 340 113 amino acids amino acid linear peptide 96 Ser Thr Val Thr Gly Arg Asp Ile Arg Thr Glu Xaa Asp Ile Tyr Le 1 5 10 15 Ser Cys Gln Leu Asp Pro Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Tyr Asn Ser Lys Gly Gln Le 35 40 45 Cys Gly Gln Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Me 50 55 60 Gly Asn Thr Ile Thr Cys Phe Leu Lys Ala Thr Ala Ala Cys Arg Al 65 70 75 80 Ala Gly Phe Thr Asp Tyr Asp Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Val Thr Glu Ser Ala Gly Val Asn Glu Asp Ile Ala Asn Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 97 CTCCACTGTC ACTGAGCAGG ACATCAGGGT AGAACTTTCC ATCTTTCAGG CCTGTGACCT 60 CAAGGACGAG GCTAGGAGGG TGATAACTTC ACTCACGGAG CGGCTTTACT GTGGTGGTCC 120 
TATGTTCAAC AGCAAGGGAC AACACTGCGG TTACCGCCGC TGCCGTGCTA GTGGGGTGCT 180 ACCCACCAGC TTCGGGAACA CAATCACCTG TTACATCAAA GCAAAGGCAG CTACCAAAGC 240 TGCCGGAATT AAAAATCCAT CATTCCTTGT CTGCGGAGAT GACTTGGTCG TGATTGCTGA 300 GAGTGCAGGG ATCGATGAGG ACAAGAGCGC CTTGAGAGCT 340 113 amino acids amino acid linear peptide 98 Ser Thr Val Thr Glu Gln Asp Ile Arg Val Glu Leu Ser Ile Phe Gl 1 5 10 15 Ala Cys Asp Leu Lys Asp Glu Ala Arg Arg Val Ile Thr Ser Leu Th 20 25 30 Glu Arg Leu Tyr Cys Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Hi 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Pro Thr Ser Ph 50 55 60 Gly Asn Thr Ile Thr Cys Tyr Ile Lys Ala Lys Ala Ala Thr Lys Al 65 70 75 80 Ala Gly Ile Lys Asn Pro Ser Phe Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ala Glu Ser Ala Gly Ile Asp Glu Asp Lys Ser Ala Leu Ar 100 105 110 Ala 340 base pairs nucleic acid single linear cDNA NO NO 99 CTCTACCGTC ACAGAGAGGG ACATACGGAC AGAAGAATCC ATCTATCTGT CTTGTCAATT 60 GCCTGAAGAG GCCCGGAAAG CCATTAAATC GCTGACAGAG AGACTATACG TGGGCGGCCC 120 GATGGAAAAC AGCAAGGGCC AGGCTTGCGG ATATAGGCGT TGCCGCGCAA GCGGGGTATT 180 CACCACAAGC TTGGGGAACA CCATGACTTG TTACATCAAA GCTAAAGCGG CTTGTAAAGC 240 CGCTGGCATT GTAGACCCGG TGATGCTCGT GTGCGGTGAC GACCTAGTGG TCATCTCAGA 300 AAGCAAGGGG GTGGAGGAGG ACCAGCGGGA CCTACGAGTC 340 113 amino acids amino acid linear peptide 100 Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Le 1 5 10 15 Ser Cys Gln Leu Pro Glu Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Glu Asn Ser Lys Gly Gln Al 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Le 50 55 60 Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Lys Ala Ala Cys Lys Al 65 70 75 80 Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Lys Gly Val Glu Glu Asp Gln Arg Asp Leu Ar 100 105 110 Val 335 base pairs nucleic acid single linear cDNA NO NO 101 CTCCACTGTC ACTGAGAGAG ACATACGGAC AGAAGAATCC ATCTAYYTGG CTTGTCAATT 60 GCCCGAAGAG GCCCGGAAGG CCATTAAATC 
ACTGACAGAG AGACTATACG TGGGCGGCCC 120 GATGGAAAAC AGCAAAGGCC AGGCCTGCGG ATATAGGCGT TGCCGCGCAA GCGGGGTATT 180 CACCACAAGC TTGGGGAACA CCATGACTTG TTACATCAAG GCCAARGCAG CTTGTAAAGC 240 YGCTGGCATT GTTGACCCGG TGATGCTCGT GTGCGGCGAC GACCTAGTGG TCATCTCAGA 300 GAGCAAGGGG GTAGAGGAGG ACCAGCGAGA CCTAC 335 113 amino acids amino acid linear peptide 102 Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Ile Xaa Xa 1 5 10 15 Ala Cys Gln Leu Pro Glu Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Glu Asn Ser Lys Gly Gln Al 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Le 50 55 60 Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Xaa Ala Ala Cys Lys Xa 65 70 75 80 Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Lys Gly Val Glu Glu Asp Gln Arg Asp Leu Xa 100 105 110 Xaa 461 base pairs nucleic acid single linear cDNA NO NO 103 CGTACAGCCT CCAGGACCCC CCCTCCCGGG AGAGCCATAG TGGTCTGCGG AACCGGTGAG 60 TACACCGGAA TTGCCGGGAA GACTGGGTCC TTTCTTGGAT TAACCCACTC TATGCCCGGA 120 GATTTGGGCG TGCCCCCGCA AGACTGCTAG CCGAGTAGCG TTGGGTTGCG AAAGGCCTTG 180 TGGTACTGCC TGATAGGGTG CTTGCGAGTG CCCCGGGAGG TCTCGTAGAC CGTGCACCAT 240 GAGCACGAAT CCTAAACCTC AAAGACAAAC CAAAAGAAAC ACCAACCGCC GCCCACAGGA 300 CGTTAAGTTC CCGGGCGGTG GCCAGATCGT TGGCGGGGTG TACTTGTTGC CGCGCAGGGG 360 CCCCAGAGTG GGTGTGCGCG CGACGAGAAA GACCTCGGAG CGGTCCCAGC CGCGTGGGAG 420 GCGCCAACCT ATCCCCAAGG TTAGGCGCAC CACCGGCCGT T 461 74 amino acids amino acid linear peptide 104 Met Ser Thr Asn Pro Lys Pro Gln Arg Gln Thr Lys Arg Asn Thr As 1 5 10 15 Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gl 20 25 30 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Val Gly Val Arg Al 35 40 45 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pr 50 55 60 Ile Pro Lys Val Arg Arg Thr Thr Gly Arg 65 70 340 base pairs nucleic acid single linear cDNA NO NO 105 CTCTACTGTC ACAGAGAGGG ATATACGAAC AGAGGAATCC ATYTATCTGG CTTGTCAATT 60 GCCCGAAGAG GCCCGGAAGG CCATCAAATC ACTGACAGAG 
AGACTATACG TGGGCGGCCC 120 GATGGAAAAC AGCAAGGGCC AGGCCTGCGG ATACAGGCGT TGCCGCGCAA GCGGGGTATT 180 CACCACAAGC TTGGGGAACA CCATGACTTG TTACATCAAA GCCAAGGCGG CTTGTAAAGC 240 CGCTGGCATT GTTGACCCAG TGATGCTCGT GTGCGGCGAC GACCTAGTGG TCATCTCAGA 300 AAGCAAGGGG GTGGAGGAGG ACCAACGAGA CCTACGANTC 340 113 amino acids amino acid linear peptide 106 Ser Thr Val Thr Glu Arg Asp Ile Arg Thr Glu Glu Ser Xaa Tyr Le 1 5 10 15 Ala Cys Gln Leu Pro Glu Glu Ala Arg Lys Ala Ile Lys Ser Leu Th 20 25 30 Glu Arg Leu Tyr Val Gly Gly Pro Met Glu Asn Ser Lys Gly Gln Al 35 40 45 Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Phe Thr Thr Ser Le 50 55 60 Gly Asn Thr Met Thr Cys Tyr Ile Lys Ala Lys Ala Ala Cys Lys Al 65 70 75 80 Ala Gly Ile Val Asp Pro Val Met Leu Val Cys Gly Asp Asp Leu Va 85 90 95 Val Ile Ser Glu Ser Lys Gly Val Glu Glu Asp Gln Arg Asp Leu Ar 100 105 110 Xaa 11 amino acids amino acid linear peptide 107 Ala Arg Gln Ser Asp Gly Arg Ser Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 108 Ala Arg Arg Ser Glu Gly Arg Ser Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 109 Glu Arg Arg Pro Glu Gly Arg Ser Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 110 Ala Arg Arg Pro Glu Gly Arg Ser Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 111 Asp Arg Arg Thr Thr Gly Lys Ser Trp Gly Arg 1 5 10 11 amino acids amino acid linear peptide 112 Asp Arg Arg Ala Thr Gly Arg Ser Trp Gly Arg 1 5 10 11 amino acids amino acid linear peptide 113 Asp Arg Arg Ala Thr Gly Lys Ser Trp Gly Arg 1 5 10 11 amino acids amino acid linear peptide 114 Val Arg Gln Pro Thr Gly Arg Ser Trp Gly Gln 1 5 10 11 amino acids amino acid linear peptide 115 Val Arg His Gln Thr Gly Arg Thr Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 116 Val Arg Gln Asn Gln Gly Arg Thr Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 117 Ala Arg Arg Thr Glu Gly Arg Ser Trp Ala Gln 1 5 10 11 amino acids amino acid linear peptide 118 Val Arg Arg Thr Thr Gly Arg Xaa Xaa Xaa Xaa 1 5 10 11 
amino acids amino acid linear peptide 119 Val Arg Arg Thr Thr Gly Arg Thr Trp Ala Gln 1 5 10 12 amino acids amino acid linear peptide 120 His Glu Val Arg Asn Ala Ser Gly Val Tyr His Val 1 5 10 12 amino acids amino acid linear peptide 121 His Glu Val Arg Asn Ala Ser Gly Val Tyr His Leu 1 5 10 12 amino acids amino acid linear peptide 122 Tyr Glu Val His Ser Thr Thr Asp Gly Tyr His Val 1 5 10 12 amino acids amino acid linear peptide 123 Val Glu Val Lys Asn Thr Ser Gln Ala Tyr Met Ala 1 5 10 12 amino acids amino acid linear peptide 124 Ile Gln Val Lys Asn Asn Ser His Phe Tyr Met Ala 1 5 10 12 amino acids amino acid linear peptide 125 Val Gln Val Lys Asn Thr Ser Thr Met Tyr Met Ala 1 5 10 12 amino acids amino acid linear peptide 126 Val Gln Val Lys Asn Thr Ser His Ser Tyr Met Val 1 5 10 12 amino acids amino acid linear peptide 127 Val Gln Val Ala Asn Arg Ser Gly Ser Tyr Met Val 1 5 10 12 amino acids amino acid linear peptide 128 Val Glu Ile Lys Asn Thr Xaa Asn Thr Tyr Val Leu 1 5 10 12 amino acids amino acid linear peptide 129 Val Glu Ile Lys Asn Thr Ser Asn Thr Tyr Val Leu 1 5 10 12 amino acids amino acid linear peptide 130 Ile Asn Tyr Arg Asn Val Ser Gly Ile Tyr Tyr Val 1 5 10 12 amino acids amino acid linear peptide 131 Ile Asn Tyr Arg Asn Thr Ser Gly Ile Tyr His Val 1 5 10 12 amino acids amino acid linear peptide 132 Ile Asn Tyr His Asn Thr Ser Gly Ile Tyr His Ile 1 5 10 12 amino acids amino acid linear peptide 133 Thr Asn Tyr Arg Asn Val Ser Gly Ile Tyr His Val 1 5 10 12 amino acids amino acid linear peptide 134 Gln His Tyr Arg Asn Val Ser Gly Ile Tyr His Val 1 5 10 12 amino acids amino acid linear peptide 135 Ile Gln Val Lys Asn Ala Ser Gly Ile Tyr His Leu 1 5 10 12 amino acids amino acid linear peptide 136 Ala His Tyr Thr Asn Lys Ser Gly Leu Tyr His Leu 1 5 10 12 amino acids amino acid linear peptide 137 Leu Asn Tyr Ala Asn Lys Ser Gly Leu Tyr His Leu 1 5 10 12 amino acids amino acid linear peptide 138 Leu Glu Tyr Arg Asn Ala Ser Gly Leu Tyr Met Val 1 5 10 11 
amino acids amino acid linear peptide 139 Ile Tyr Glu Met Asp Gly Met Ile Met His Tyr 1 5 10 11 amino acids amino acid linear peptide 140 Ile Tyr Glu Met Ser Gly Met Ile Leu His Ala 1 5 10 11 amino acids amino acid linear peptide 141 Val Tyr Glu Ala Lys Asp Ile Ile Leu His Thr 1 5 10 11 amino acids amino acid linear peptide 142 Val Trp Gln Leu Xaa Asp Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 143 Val Trp Gln Leu Arg Asp Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 144 Ile Trp Gln Met Gln Gly Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 145 Val Trp Gln Leu Lys Asp Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 146 Val Trp Gln Leu Glu Glu Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 147 Thr Trp Gln Leu Xaa Xaa Ala Val Leu His Val 1 5 10 11 amino acids amino acid linear peptide 148 Val Tyr Glu Ala Asp His His Ile Leu His Leu 1 5 10 11 amino acids amino acid linear peptide 149 Val Tyr Glu Ala Asp His His Ile Leu Ala Leu 1 5 10 11 amino acids amino acid linear peptide 150 Val Phe Glu Ala Asp His His Ile Leu His Leu 1 5 10 11 amino acids amino acid linear peptide 151 Val Tyr Glu Ser Asp His His Ile Leu His Leu 1 5 10 10 amino acids amino acid linear peptide 152 Val Phe Glu Glu Thr Met Ile Leu His Leu 1 5 10 11 amino acids amino acid linear peptide 153 Val Tyr Glu Ala Glu Thr Leu Ile Leu His Leu 1 5 10 11 amino acids amino acid linear peptide 154 Val Tyr Glu Ala Asn Gly Met Ile Leu His Leu 1 5 10 11 amino acids amino acid linear peptide 155 Val Tyr Glu Ala Gly Asp Ile Ile Leu His Leu 1 5 10 13 amino acids amino acid linear peptide 156 Val Arg Glu Asp Asn His Leu Arg Cys Trp Met Ala Leu 1 5 10 13 amino acids amino acid linear peptide 157 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Met Ala Leu 1 5 10 13 amino acids amino acid linear peptide 158 Ile Arg Glu Gly Asn Ile Ser Arg Cys Trp Val Leu Pro 1 5 10 13 amino acids amino acid linear peptide 159 Glu Asn Ser Ser 
Gly Arg Phe His Cys Trp Ile Pro Ile 1 5 10 13 amino acids amino acid linear peptide 160 Glu Arg Ser Gly Asn Arg Thr Phe Cys Trp Thr Ala Val 1 5 10 13 amino acids amino acid linear peptide 161 Glu Leu Gln Gly Asn Lys Ser Arg Cys Trp Ile Pro Val 1 5 10 13 amino acids amino acid linear peptide 162 Glu Arg His Gln Asn Gln Ser Arg Cys Trp Ile Pro Val 1 5 10 13 amino acids amino acid linear peptide 163 Glu Trp Lys Asp Asn Thr Ser Arg Cys Trp Ile Pro Val 1 5 10 13 amino acids amino acid linear peptide 164 Glu Arg Glu Gly Asn Ser Ser Arg Cys Trp Ile Pro Val 1 5 10 13 amino acids amino acid linear peptide 165 Val Arg Glu Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 166 Val Arg Thr Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 167 Val Arg Val Gly Asn Gln Ser Ser Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 168 Val Arg Val Gly Asn Gln Ser Arg Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 169 Val Lys Glu Gly Asn His Ser Arg Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 170 Val Lys Thr Gly Asn Thr Ser Arg Cys Trp Val Ala Leu 1 5 10 13 amino acids amino acid linear peptide 171 Ile Lys Ala Gly Asn Glu Ser Arg Cys Trp Leu Pro Val 1 5 10 13 amino acids amino acid linear peptide 172 Val Lys Xaa Xaa Asn Gln Ser Arg Cys Trp Val Gln Ala 1 5 10 13 amino acids amino acid linear peptide 173 Val Lys Thr Gly Asn Leu Thr Lys Cys Trp Leu Ser Ala 1 5 10 13 amino acids amino acid linear peptide 174 Val Arg Ser Gly Asn Thr Ser Arg Cys Trp Ile Pro Val 1 5 10 10 amino acids amino acid linear peptide 175 Val Lys Asn Ala Ser Val Pro Thr Ala Ala 1 5 10 10 amino acids amino acid linear peptide 176 Val Lys Asp Ala Asn Val Pro Thr Ala Ala 1 5 10 10 amino acids amino acid linear peptide 177 Ala Arg Ile Ala Asn Ala Pro Ile Asp Glu 1 5 10 10 amino acids amino acid linear peptide 178 Val Ser Lys Pro Gly Ala Leu Thr Lys Gly 1 5 10 10 amino acids amino acid 
linear peptide 179 Val Ser Arg Pro Gly Ala Leu Thr Arg Gly 1 5 10 10 amino acids amino acid linear peptide 180 Val Asn Gln Pro Gly Ala Leu Thr Arg Gly 1 5 10 10 amino acids amino acid linear peptide 181 Val Ser Gln Pro Gly Ala Leu Thr Arg Gly 1 5 10 10 amino acids amino acid linear peptide 182 Val Ser Gln Pro Gly Ala Leu Thr Lys Gly 1 5 10 10 amino acids amino acid linear peptide 183 Val Ser Arg Pro Gly Ala Leu Thr Glu Gly 1 5 10 10 amino acids amino acid linear peptide 184 Ala Pro Tyr Ile Gly Ala Pro Leu Glu Ser 1 5 10 10 amino acids amino acid linear peptide 185 Ala Pro Tyr Thr Ala Ala Pro Leu Glu Ser 1 5 10 10 amino acids amino acid linear peptide 186 Ala Pro Ile Leu Ser Ala Pro Leu Met Ser 1 5 10 10 amino acids amino acid linear peptide 187 Val Pro Asn Ser Ser Val Pro Ile His Gly 1 5 10 10 amino acids amino acid linear peptide 188 Val Pro Asn Ala Ser Thr Pro Val Thr Gly 1 5 10 10 amino acids amino acid linear peptide 189 Val Gln Asn Ala Ser Val Ser Ile Arg Gly 1 5 10 10 amino acids amino acid linear peptide 190 Val Lys Ser Pro Cys Ala Ala Thr Ala Ser 1 5 10 10 amino acids amino acid linear peptide 191 Ser Pro Arg Met His His Thr Thr Gln Glu 1 5 10 10 amino acids amino acid linear peptide 192 Ser Pro Arg Leu Tyr His Thr Thr Gln Glu 1 5 10 10 amino acids amino acid linear peptide 193 Thr Ser Arg Arg His Trp Thr Val Gln Asp 1 5 10 10 amino acids amino acid linear peptide 194 Ala Pro Lys Arg His Tyr Phe Val Gln Glu 1 5 10 10 amino acids amino acid linear peptide 195 Ser Pro Gln Tyr His Thr Phe Val Gln Glu 1 5 10 10 amino acids amino acid linear peptide 196 Ser Pro Gln His His Asn Phe Ser Gln Asp 1 5 10 10 amino acids amino acid linear peptide 197 Ser Pro Gln His His Ile Phe Val Gln Asp 1 5 10 10 amino acids amino acid linear peptide 198 Ser Pro Glu His His His Phe Val Gln Asp 1 5 10 10 amino acids amino acid linear peptide 199 Arg Pro Arg Arg His Trp Thr Thr Gln Asp 1 5 10 10 amino acids amino acid linear peptide 200 Arg Pro Arg Arg His Trp Thr Ala Gln Asp 1 5 10 
10 amino acids amino acid linear peptide 201 Gln Pro Arg Arg His Trp Thr Thr Gln Asp 1 5 10 10 amino acids amino acid linear peptide 202 Arg Pro Arg Arg His Trp Thr Thr Gln Glu 1 5 10 10 amino acids amino acid linear peptide 203 Gln Pro Arg Arg His Trp Thr Val Gln Asp 1 5 10 10 amino acids amino acid linear peptide 204 Arg Pro Lys Tyr His Gln Val Thr Gln Asp 1 5 10 10 amino acids amino acid linear peptide 205 Arg Pro Arg Met His Gln Val Val Gln Glu 1 5 10 10 amino acids amino acid linear peptide 206 Arg Pro Arg Met Tyr Glu Ile Ala Gln Asp 1 5 10 10 amino acids amino acid linear peptide 207 Arg His Arg Gln His Trp Thr Val Gln Asp 1 5 10 
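The listing above pairs each cDNA entry with the peptide it encodes. As a minimal sketch of that relationship, the following Python snippet translates a cDNA fragment with the standard genetic code. The function name `translate` and the choice to report any codon containing an ambiguity code (R, Y, N) as `X` (Xaa) are illustrative assumptions, not part of the listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cdna, frame=0):
    """Translate a cDNA string into one-letter amino acid codes.

    Whitespace is ignored. Codons containing anything other than A, C, G or T
    are reported as 'X' (a simplifying assumption of this sketch).
    """
    seq = "".join(cdna.upper().split())
    protein = []
    for pos in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


# Start of the open reading frame in SEQ ID NO 103 (from the ATG onwards).
print(translate("ATGAGCACGA ATCCTAAACC TCAAAGACAA ACC"))  # MSTNPKPQRQT
```

The printed result corresponds to the first eleven residues of SEQ ID NO 104 (Met Ser Thr Asn Pro Lys Pro Gln Arg Gln Thr).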

1. An HCV polynucleic acid, having a nucleotide sequence which is unique to a theretofore unidentified HCV type or subtype which is different from HCV subtypes 1a, 1b, 1c, 2a, 2b, 2c, 2d, 3a, 3b, 3c, 3d, 3e, 3f, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 5a or 6a, with said HCV subtypes being classified as in Table 3 by comparison of a part of the NS5 gene nucleotide sequence spanning positions 7932 to 8271, with said amino acid numbering being shown in Table 1, and with said polynucleic acid containing at least one nucleotide differing from said known HCV nucleotide sequences, or the complement thereof.
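The comparison of NS5 fragments referred to in claim 1 (positions 7932 to 8271, i.e. 340 nucleotides) can be illustrated by a simple pairwise identity calculation. The sketch below is only an illustration: the function name `percent_identity`, the requirement that both fragments are already aligned and of equal length, and the decision to skip ambiguous positions are assumptions, and the classification thresholds of Table 3 are not reproduced here.

```python
UNAMBIGUOUS = set("ACGT")


def percent_identity(seq_a, seq_b):
    """Percentage of identical nucleotides between two aligned fragments.

    Positions where either sequence carries an ambiguity code (e.g. R, Y, N)
    are skipped; this is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no unambiguous positions to compare")
    return 100.0 * matches / compared
```

A fragment would then be compared against each known reference sequence, and the resulting identities read against whatever type and subtype boundaries the classification table defines.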
 2. A polynucleic acid according to claim 1, having a nucleotide sequence which is unique to at least one of HCV subtypes 1d, 1e, 1f, 1g, 2e, 2f, 2g, 2h, 2i, 2k, 2l, 3g, 4k, 4l, 4m, 7a, 7c or 7d, with said HCV subtypes being classified as defined in claim 1.
 3. A polynucleic acid according to claim 1, having a nucleotide sequence which is unique to at least one of HCV types 9, 10 or 11, with said HCV types being classified as defined in claim 1.
 4. A polynucleic acid according to any of claims 1 to 3 encoding an HCV polyprotein comprising in its amino acid sequence at least one of the following amino acid residues: I15, C38, V44, A49, Q43, P49, Q55, A58, S60 or D60, E68 or V68, H70, A71 or Q71 or N71, D72, H81, H101, D106, S110, L130, I134, E135, L140, S148, T150 or E150, Q153, F155, D157, G160, E165, I169, F181, L186, T190, T192 or I192 or H192, I193, A195, S196, R197 or N197 or K197, Q199 or D199 or H199 or N199, F200 or T200, A208, I213, M216 or S216, N217 or S217 or G217 or K217, T218, I219, A222, Y223, I230, W231 or L231, S232 or H232 or A232, Q233, E235 or L235, F236 or T236, F237, L240 or M240, A242, N244, N249, I250 or K250 or R250, A252 or C252, A254, I255 or V255, D256 or M256, E257, E260 or K260, R261, V268, S272 or R272, I285, G290 or F290, A291, A293 or L293 or W293, T294 or A294, S295 or H295, K296 or E296, Y297 or M297, I299 or Y299, I300, S301, P316, S2646, A2648, G2649, A2650, V2652, Q2653, H2656 or L2656, D2657, F2659, K2663 or Q2663, A2667 or V2667, D2677, L2681, M2686 or Q2686 or E2686, A2692 or K2692, H2697, I2707, L2708 or Y2708, A2709, A2719 or M2719, F2727, T2728 or D2728, E2729, F2730 or Y2730, I2741, I2745, V2746 or E2746 or L2746 or K2746, A2748, S2749 or P2749, R2750, E2751, D2752 or N2752 or S2752 or T2752 or V2752 or I2752 or Q2752, S2753 or D2753 or G2753, D2754, A2755, L2756 or Q2756, R2757, with said notation being composed of a letter representing the amino acid residue by its one-letter code, and a number representing the amino acid numbering as shown in Table 1, or a part of said polynucleic acid which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one nucleotide differing from known HCV nucleotide sequences, or the complement thereof.
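The residue notation used in claim 4 (a one-letter amino acid code followed by a position, with alternatives joined by "or") lends itself to a mechanical check. The following Python sketch parses such an entry and tests a polypeptide fragment for it. The helper names `parse_residue_spec` and `has_residue`, and the `fragment_start` parameter giving the position of the first residue of the fragment, are hypothetical; the mapping of a given fragment onto the numbering of Table 1 is assumed to be known.

```python
import re


def parse_residue_spec(spec):
    """Parse an entry such as 'S60 or D60' into (60, {'S', 'D'})."""
    pairs = re.findall(r"([A-Z])(\d+)", spec)
    positions = {int(number) for _, number in pairs}
    if len(positions) != 1:
        raise ValueError("entry must name exactly one position: %r" % spec)
    return positions.pop(), {residue for residue, _ in pairs}


def has_residue(fragment, fragment_start, spec):
    """Check whether a fragment carries one of the residues named in spec.

    fragment_start is the position (in the assumed numbering) of the first
    residue of the fragment.
    """
    position, allowed = parse_residue_spec(spec)
    index = position - fragment_start
    return 0 <= index < len(fragment) and fragment[index] in allowed
```

For example, `has_residue(fragment, 1, "I15")` would report whether a fragment starting at position 1 carries isoleucine at position 15.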
 5. A polynucleic acid according to any of claims 1 to 4, with said polynucleic acid encoding a HCV polyprotein comprising in its amino acid sequence at least one amino acid sequence chosen from the following list:
ARQSDGRSWAQ or ARRSEGRSWAQ as for subtype 1d (SEQ ID NO 107 and 108)
ERRPEGRSWAQ as for subtype 1e (SEQ ID NO 109)
ARRPEGRSWAQ as for subtype 1f (SEQ ID NO 110)
DRRTTGKSWGR as for subtype 2k (SEQ ID NO 111)
DRRATGRSWGR as for subtype 2e (SEQ ID NO 112)
DRRATGKSWGR as for subtype 2f (SEQ ID NO 113)
VRQPTGRSWGQ as for type 9 (SEQ ID NO 114)
VRHQTGRTWAQ as for subtype 7a and 7c (SEQ ID NO 115)
VRQNQGRTWAQ as for subtype 7d (SEQ ID NO 116)
ARRTEGRSWAQ as for type 10 (SEQ ID NO 117)
VRRTTGRXXXX or VRRTTGRTWAQ as for type 11 (SEQ ID NO 118 and 119)
HEVRNASGVYHV or HEVRNASGVYHL as for subtype 1d (SEQ ID NO 120 and 121)
YEVHSTTDGYHV as for subtype 1f (SEQ ID NO 122)
VEVKNTSQAYMA as for subtype 2e (SEQ ID NO 123)
IQVKNNSHFYMA as for subtype 2f (SEQ ID NO 124)
VQVKNTSTMYMA as for subtype 2g (SEQ ID NO 125)
VQVKNTSHSYMV as for subtype 2h (SEQ ID NO 126)
VQVANRSGSYMV as for subtype 2i (SEQ ID NO 127)
VEIKNTXNTYVL or VEIKNTSNTYVL as for subtype 2k (SEQ ID NO 128 and 129)
INYRNVSGIYYV or INYRNTSGIYHV or INYHNTSGIYHI or TNYRNVSGIYHV as for subtype 4k (SEQ ID NO 130, 131, 132 or 133)
QHYRNVSGIYHV as for subtype 4l (SEQ ID NO 134)
IQVKNASGIYHL as for type 9 (SEQ ID NO 135)
AHYTNKSGLYHL as for subtype 7c (SEQ ID NO 136)
LNYANKSGLYHL as for subtype 7d (SEQ ID NO 137)
LEYRNASGLYMV as for type 10 (SEQ ID NO 138)
IYEMDGMIMHY or IYEMSGMILHA as for subtype 1d (SEQ ID NO 139 and 140)
VYEAKDIILHT as for subtype 1f (SEQ ID NO 141)
VWQLXDAVLHV as for subtype 2e (SEQ ID NO 142)
VWQLRDAVLHV as for subtype 2f (SEQ ID NO 143)
IWQMQGAVLHV as for subtype 2g (SEQ ID NO 144)
VWQLKDAVLHV as for subtype 2h (SEQ ID NO 145)
VWQLEEAVLHV as for subtype 2i (SEQ ID NO 146)
TWQLXXAVLHV as for subtype 2k (SEQ ID NO 147)
VYEADHHILHL or VYEADHHILAL or VFEADHHILHL as for subtype 4k (SEQ ID NO 148, 149 and 150)
VYESDHHILHL as for subtype 4l (SEQ ID NO 151)
VFEAETMILHL as for type 9 (SEQ ID NO 152)
VYEAETLILHL as for subtype 7c (SEQ ID NO 153)
VYEANGMILHL as for subtype 7d (SEQ ID NO 154)
VYEAGDIILHL as for type 10 (SEQ ID NO 155)
VREDNHLRCWMAL or VRENNSSRCWMAL as for subtype 1d (SEQ ID NO 156 and 157)
IREGNISRCWVPL as for subtype 1f (SEQ ID NO 158)
ENSSGRFHCWIPI as for subtype 2e (SEQ ID NO 159)
ERSGNRTFCWTAV as for subtype 2f (SEQ ID NO 160)
ELQGNKSRCWIPV as for subtype 2g (SEQ ID NO 162)
ERHQNQSRCWIPV as for subtype 2h (SEQ ID NO 163)
EWKDNTSRCWIPV as for subtype 2i (SEQ ID NO 164)
EREGNSSRCWIPV as for subtype 2k (SEQ ID NO 165)
VREGNQSRCWVAL or VRTGNQSRCWVAL or VRVGNQSSCWVAL or VRVGNQSRCWVAL or VKEGNHSRCWVAL as for subtype 4k (SEQ ID NO 166, 167, 168 or 169)
VKTGNTSRCWVAL as for subtype 4l (SEQ ID NO 170)
IKAGNESRCWLPV as for type 9 (SEQ ID NO 171)
VKEGNQSRCWVQA as for subtype 7c (SEQ ID NO 172)
VKXXNLTKCWLSA as for subtype 7d (SEQ ID NO 173)
VRSGNTSRCWIPV as for type 10 (SEQ ID NO 174)
VKNASVPTAA or VKDANVPTAA as for subtype 1d (SEQ ID NO 175 and 176)
ARIANAPIDE as for subtype 1f (SEQ ID NO 177)
VSKPGALTKG as for subtype 2e (SEQ ID NO 178)
VSRPGALTRG as for subtype 2f (SEQ ID NO 179)
VNQPGALTRG as for subtype 2g (SEQ ID NO 180)
VSQPGALTRG as for subtype 2h (SEQ ID NO 181)
VSQPGALTKG as for subtype 2i (SEQ ID NO 182)
VSRPGALTEG as for subtype 2k (SEQ ID NO 183)
APYIGAPLES or APYTAAPLES as for subtype 4k (SEQ ID NO 184 and 185)
APILSAPLMS as for subtype 4l (SEQ ID NO 186)
VPNSSVPIHG as for type 9 (SEQ ID NO 187)
VPNASTPVTG as for subtype 7c (SEQ ID NO 188)
VQNASVSIRG as for subtype 7d (SEQ ID NO 189)
VKSPCAATAS as for type 10 (SEQ ID NO 190)
SPRMHHTTQE or SPRLYHTTQE as for subtype 1d (SEQ ID NO 191 and 192)
TSRRHWTVQD as for subtype 1f (SEQ ID NO 193)
APKRHYFVQE as for subtype 2e (SEQ ID NO 194)
SPQYHTFVQE as for subtype 2f (SEQ ID NO 195)
SPQHHNFSQD as for subtype 2g (SEQ ID NO 196)
SPQHHIFVQD as for subtype 2h (SEQ ID NO 197)
SPEHHHFVQD as for subtype 2k (SEQ ID NO 198)
RPRRHWTTQD or RPRRHWTAQD or QPRRHWTTQD or RPRRHWTTQE as for subtype 4k (SEQ ID NO 199, 200, 201 or 202)
QPRRHWTVQD as for subtype 4l (SEQ ID NO 203)
RPKYHQVTQD as for type 9 (SEQ ID NO 204)
RPRMHQVVQE as for subtype 7c (SEQ ID NO 205)
RPRMYEIAQD as for subtype 7d (SEQ ID NO 206)
RHRQHWTVQD as for type 10 (SEQ ID NO 207)
or a part of said polynucleic acid which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one nucleotide differing from known HCV nucleotide sequences, or the complement thereof.
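The type- and subtype-specific motifs listed in claim 5 can be used as a lookup table when screening a translated sequence. The sketch below scans a polypeptide for a handful of the motifs copied from that list. The names `MOTIFS`, `motif_to_regex` and `find_motifs` are illustrative, only a few motifs are included, and treating X (Xaa) as a wildcard for any residue is an assumption of this sketch.

```python
import re

# A few motifs taken from the list in claim 5:
# motif -> (type or subtype, SEQ ID NO).
MOTIFS = {
    "ERRPEGRSWAQ": ("subtype 1e", 109),
    "ARRPEGRSWAQ": ("subtype 1f", 110),
    "VRQPTGRSWGQ": ("type 9", 114),
    "VRQNQGRTWAQ": ("subtype 7d", 116),
    "VEIKNTXNTYVL": ("subtype 2k", 128),
}


def motif_to_regex(motif):
    """Compile a motif, letting X (Xaa) match any single residue."""
    return re.compile("".join("." if c == "X" else c for c in motif))


def find_motifs(protein):
    """Return sorted (offset, motif, label, seq_id) tuples for motifs found."""
    hits = []
    for motif, (label, seq_id) in MOTIFS.items():
        match = motif_to_regex(motif).search(protein)
        if match:
            hits.append((match.start(), motif, label, seq_id))
    return sorted(hits)
```

A hit only indicates that the motif is present in the screened sequence; it is not by itself a classification, which in the claims rests on comparison with the known types and subtypes.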
 6. A polynucleic acid according to any of claims 1 to 5 having a sequence selected from any of SEQ ID NO 1 to 105, or a part of said polynucleic acid which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one nucleotide differing from known HCV nucleotide sequences, or the complement thereof.
 7. A polynucleic acid according to any of claims 1 to 6, which codes for the 5′ UR, the Core/E1, the NS4 or the NS5B region or a part thereof.
 8. A polynucleic acid according to any of claims 1 to 7 which is a cDNA sequence.
 9. An oligonucleotide primer comprising part of a polynucleic acid according to any of claims 1 to 8, with said primer being able to act as primer for specifically amplifying the nucleic acid of a certain isolate belonging to the genotype from which the primer is derived.
 10. An oligonucleotide probe comprising part of a polynucleic acid according to any of claims 1 to 8, with said probe being able to act as a hybridization probe for specific detection and/or classification into types and/or subtypes of a HCV nucleic acid containing said nucleotide sequence, with said probe being possibly labelled or attached to a solid substrate.
 11. A diagnostic kit for use in determining the genotype of HCV, said kit comprising a primer according to claim 9.
 12. A diagnostic kit for use in determining the genotype of HCV, said kit comprising a probe according to claim 10.
 13. A diagnostic kit according to claim 12, wherein said probe(s) is(are) attached to a solid substrate.
 14. A diagnostic kit according to claim 13, wherein a range of said probes are attached to specific locations on a solid substrate.
 15. A diagnostic kit according to claim 14, wherein said solid support is a membrane strip and said probes are coupled to the membrane in the form of parallel lines.
 16. A method for the detection of HCV nucleic acids present in a biological sample, comprising: (i) possibly extracting sample nucleic acid, (ii) amplifying the nucleic acid with at least one primer according to claim 9, (iii) detecting the amplified nucleic acids.
 17. A method for the detection of HCV nucleic acids present in a biological sample, comprising: (i) possibly extracting sample nucleic acid, (ii) possibly amplifying the nucleic acid with at least one primer according to claim 9, or with a universal HCV primer, (iii) hybridizing the nucleic acids of the biological sample, possibly under denatured conditions, at appropriate conditions with one or more probes according to claim 10, with said probes being possibly attached to a solid substrate, (iv) possibly washing at appropriate conditions, (v) detecting the hybrids formed.
 18. A method for detecting the presence of one or more HCV genotypes present in a biological sample, comprising: (i) possibly extracting sample nucleic acid, (ii) specifically amplifying the nucleic acid with at least one primer according to claim 9, (iii) detecting said amplified nucleic acids, (iv) inferring the presence of one or more genotypes of HCV present from the observed pattern of amplified fragments.
 19. A method for detecting the presence of one or more HCV genotypes present in a biological sample, comprising: (i) possibly extracting sample nucleic acid, (ii) possibly amplifying the nucleic acid with at least one primer according to claim 9 or with a universal HCV primer, (iii) hybridizing the nucleic acids of the biological sample, possibly under denatured conditions, at appropriate conditions with one or more probes according to claim 10, with said probes being possibly attached to a solid substrate, (iv) possibly washing at appropriate conditions, (v) detecting the hybrids formed, (vi) inferring the presence of one or more HCV genotypes present from the observed hybridization pattern.
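Step (vi) of claim 19, inferring genotypes from the observed hybridization pattern, amounts to matching the set of reactive probes against the pattern expected for each genotype. The following Python sketch shows one way this could be done. The probe line names and their assignment to subtypes in `PROBE_SPECIFICITY` are entirely hypothetical and do not correspond to any actual probe of the invention.

```python
# Hypothetical strip layout: probe line -> subtypes the probe is assumed to detect.
PROBE_SPECIFICITY = {
    "line_1": {"1d", "1e"},
    "line_2": {"1d"},
    "line_3": {"1e"},
    "line_4": {"2f"},
}


def expected_pattern(genotype):
    """Probe lines expected to react when the given genotype is present."""
    return {line for line, types in PROBE_SPECIFICITY.items() if genotype in types}


def infer_genotypes(reactive_lines):
    """Return genotypes whose full expected pattern is among the reactive lines."""
    reactive = set(reactive_lines)
    all_genotypes = set().union(*PROBE_SPECIFICITY.values())
    return sorted(g for g in all_genotypes if expected_pattern(g) <= reactive)
```

With this hypothetical layout, a strip showing `line_1` and `line_2` would be read as subtype 1d, while additional reactivity on `line_4` would suggest a mixed sample containing 1d and 2f.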
 20. A method according to claim 19, wherein said probes are further characterized as defined in any of claims 13 to 15.
 21. A method according to claims 16 to 18, wherein said nucleic acids are labelled during or after amplification.
 22. A polypeptide having an amino acid sequence encoded by a polynucleic acid according to any of claims 1 to 8, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in claims 2 or 3, and which contains at least one amino acid differing from any of the known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent.
 23. A polypeptide according to claim 22 comprising in its amino acid sequence at least one of the following amino acid residues: I15, C38, V44, A49, Q43, P49, Q55, A58, S60 or D60, E68 or V68, H70, A71 or Q71 or N71, D72, H81, H101, D106, S110, L130, I134, E135, L140, S148, T150 or E150, Q153, F155, D157, G160, E165, I169, F181, L186, T190, T192 or I192 or H192, I193, A195, S196, R197 or N197 or K197, Q199 or D199 or H199 or N199, F200 or T200, A208, I213, M216 or S216, N217 or S217 or G217 or K217, T218, I219, A222, Y223, I230, W231 or L231, S232 or H232 or A232, Q233, E235 or L235, F236 or T236, F237, L240 or M240, A242, N244, N249, I250 or K250 or R250, A252 or C252, A254, I255 or V255, D256 or M256, E257, E260 or K260, R261, V268, S272 or R272, I285, G290 or F290, A291, A293 or L293 or W293, T294 or A294, S295 or H295, K296 or E296, Y297 or M297, I299 or Y299, I300, S301, P316, S2646, A2648, G2649, A2650, V2652, Q2653, H2656 or L2656, D2657, F2659, K2663 or Q2663, A2667 or V2667, D2677, L2681, M2686 or Q2686 or E2686, A2692 or K2692, H2697, I2707, L2708 or Y2708, A2709, A2719 or M2719, F2727, T2728 or D2728, E2729, F2730 or Y2730, I2741, I2745, V2746 or E2746 or L2746 or K2746, A2748, S2749 or P2749, R2750, E2751, D2752 or N2752 or S2752 or T2752 or V2752 or I2752 or Q2752, S2753 or D2753 or G2753, D2754, A2755, L2756 or Q2756, or R2757, with said notation being composed of a letter representing the amino acid residue by its one-letter code, and a number representing the amino acid numbering as shown in Table 1, or a part of said polypeptide which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one amino acid differing from known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide.
 24. A polypeptide according to claim 22 comprising in its amino acid sequence at least one of the sequences represented by SEQ ID NO 107 to 207 as listed in claim 5, or part of said polypeptide which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one amino acid differing from known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide.
 25. A polypeptide having an amino acid sequence as represented in any of SEQ ID NO 1 to 106, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in claims 2 to 3, and which contains at least one amino acid differing from known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide.
 26. A recombinant polypeptide encoded by a polynucleic acid according to any of claims 1 to 8, or a part thereof which is unique to at least one of the HCV subtypes or types as defined in claims 2 or 3, and which contains at least one amino acid differing from known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent to said polypeptide.
 27. A method for production of a recombinant polypeptide of claim 26, comprising: transformation of an appropriate cellular host with a recombinant vector, in which a polynucleic acid or a part thereof according to any of claims 1 to 8 has been inserted under the control of the appropriate regulatory elements, culturing said transformed cellular host under conditions enabling the expression of said insert, and, harvesting said polypeptide.
 28. A recombinant expression vector comprising a polynucleic acid or a part thereof according to any of claims 1 to 8 operably linked to prokaryotic, eukaryotic or viral transcription and translation control elements.
 29. A host cell transformed with a recombinant vector according to claim 28.
 30. A method for detecting antibodies to HCV present in a biological sample, comprising: (i) contacting the biological sample to be analysed for the presence of HCV with a polypeptide according to any of claims 22 to 26, (ii) detecting the immunological complex formed between said antibodies and said polypeptide.
 31. A method for HCV typing, comprising: (i) contacting the biological sample to be analysed for the presence of HCV with a polypeptide according to any of claims 22 to 26, (ii) detecting the immunological complex formed between said antibodies and said polypeptide.
 32. A diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one polypeptide according to any of claims 22 to 26, with said polypeptide being possibly bound to a solid support.
 33. A diagnostic kit for HCV typing, said kit comprising at least one polypeptide according to any of claims 22 to 26, with said polypeptide being possibly bound to a solid support.
 34. A diagnostic kit according to claims 32 to 33, said kit comprising a range of polypeptides which are attached to specific locations on a solid substrate.
 35. A diagnostic kit according to claims 32 to 34, wherein said solid support is a membrane strip and said polypeptides are coupled to the membrane in the form of parallel lines.
 36. A pharmaceutical composition comprising at least one polypeptide according to any of claims 22 to 26 and a suitable excipient, diluent or carrier.
 37. A method of preventing HCV infection, comprising administering the pharmaceutical composition of claim 36 to a mammal in effective amount to stimulate the production of protective antibody or protective T-cell response.
 38. Use of a composition according to claim 36 in a method for preventing HCV infection as defined in claim 37.
 39. A vaccine for immunizing a mammal against HCV infection, comprising at least one polypeptide according to claims 22 to 26, in a pharmaceutically acceptable carrier.
 40. A vaccine according to claim 39, comprising at least one polypeptide according to claims 22 to 26, with said polypeptide being unique for at least one of the HCV subtypes as defined in claims 2 or 3.
 41. A peptide corresponding to an amino acid sequence encoded by at least one of the HCV polynucleic acids according to any of claims 1 to 8, with said peptide comprising an epitope being unique to at least one of the HCV subtypes or types as defined in claims 2 or 3, and with said peptide containing at least one amino acid differing from any of the known HCV types or subtypes amino acid sequences, or an analog thereof being substantially homologous and biologically equivalent.
 42. A method for detecting antibodies to HCV present in a biological sample, comprising: (i) contacting the biological sample to be analysed for the presence of HCV with a peptide according to claim 41, (ii) detecting the immune complex formed between said antibodies and said peptide.
 43. A method for HCV typing, comprising: (i) contacting the biological sample to be analysed for the presence of HCV with a peptide according to claim 41, (ii) detecting the immune complex formed between said antibodies and said peptide.
 44. A diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one peptide according to claim 41, with said peptide being possibly bound to a solid support.
 45. A diagnostic kit for HCV typing, said kit comprising at least one peptide according to claim 41, with said peptide being possibly bound to a solid support.
 46. A diagnostic kit according to claims 44 or 45, wherein said peptides are selected from the following list: at least one NS4 peptide, at least one NS4 peptide and at least one Core peptide, at least one NS4 peptide and at least one Core peptide and at least one E1 peptide, or, at least one NS4 peptide and at least one E1 peptide.
 47. A diagnostic kit according to claims 44 to 46, said kit comprising a range of peptides which are attached to specific locations on a solid substrate.
 48. A diagnostic kit according to claims 44 to 47, wherein said solid support is a membrane strip and said peptides are coupled to the membrane in the form of parallel lines.
 49. A pharmaceutical composition comprising at least one peptide according to claim 41 and a suitable excipient, diluent or carrier.
 50. A method of preventing HCV infection, comprising administering the pharmaceutical composition of claim 49 to a mammal in an effective amount to stimulate the production of protective antibody or protective T-cell response.
 51. Use of a composition according to claim 49 in a method for preventing HCV infection as defined in claim 50.
 52. A vaccine for immunizing a mammal against HCV infection, comprising at least one peptide according to claim 41, in a pharmaceutically acceptable carrier.
 53. A vaccine according to claim 52, comprising at least one peptide according to claim 41, with said peptide being unique for at least one of the subtypes or types as defined in claims 2 or 3.
 54. An antibody raised upon immunization with at least one polypeptide or peptide according to any of claims 22 to 26 or 41, with said antibody being specifically reactive with any of said polypeptides or peptides, and with said antibody being preferably a monoclonal antibody.
 55. A method for detecting HCV antigens present in a biological sample, comprising: (i) contacting said biological sample with an antibody according to claim 54, (ii) detecting the immune complexes formed between said HCV antigens and said antibody.
 56. A method for HCV typing, comprising: (i) contacting said biological sample with an antibody according to claim 54, (ii) detecting the immune complexes formed between said HCV antigens and said antibody.
 57. A diagnostic kit for use in detecting the presence of HCV, said kit comprising at least one antibody according to claim 54, with said antibody being possibly bound to a solid support.
 58. A diagnostic kit for HCV typing, said kit comprising at least one antibody according to claim 54, with said antibody being possibly bound to a solid support.
 59. A diagnostic kit according to claims 57 or 58, said kit comprising a range of antibodies which are attached to specific locations on a solid substrate.
 60. A pharmaceutical composition comprising at least one antibody according to claim 54 and a suitable excipient, diluent or carrier.
 61. A method of preventing or treating HCV infection, comprising administering the pharmaceutical composition of claim 60 to a mammal in an effective amount.
 62. Use of a composition according to claim 60 in a method for preventing or treating HCV infection as defined in claim 61.